ofbiz wiki(confluence)

Adam Heath-2

ofbiz wiki(confluence)

So, I need some admin help with cwiki.apache.org, or at least advice.
I've got a script that uses xmlrpc to confluence, and fetches all
previous page(+versions), comments, attachments(+versions), tracks
renames, usernames, and commit messages. I then take all this data
and convert it into a long series of git commits, with the files laid
out in a proper webslinger design. The author of each git commit is
the person who changed the page, added a comment, or uploaded a new
attachment.

The issue I am having is the confluence installed on cwiki is old.
Newer versions support returning the PageHistorySummary.versionComment
thru the rpc; currently, I have to fall back and do a screen scrape of
the viewpreviousversions.action page.
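
A minimal sketch of that fallback, assuming each history record arrives as a plain dict decoded from XML-RPC. The `id` key and the `scrape_comment` callable (standing in for the screen scrape of viewpreviousversions.action) are assumptions for illustration, not the actual script:

```python
def version_comment(summary, scrape_comment):
    """Return the change comment for one page-history record.

    `summary` is assumed to be a dict decoded from the XML-RPC reply.
    `scrape_comment` is a hypothetical callable that screen-scrapes the
    history page for the given record id.
    """
    comment = summary.get("versionComment")
    if comment is not None:
        # Newer servers expose the field directly.
        return comment
    # Older servers omit the field, so fall back to scraping.
    return scrape_comment(summary["id"])
```

Once the server exposes versionComment, the scraper is simply never called.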

Where should I ask for help on this, getting this new api implemented?

I also have suggestions to make the api more lightweight, when doing
incremental updates(which my system supports).

As a side note, there is a severe lack of version comments. This
script ends up producing 3117 commits. Some of those are page
renames/comments/attachments, which don't have a commit message. Most
are page commits. There are only 70 change messages. It'd be nice if
people would comment when they change a page, but I don't see a way to
enforce that.

Jacques Le Roux

Re: ofbiz wiki(confluence)

Administrator

From: "Adam Heath" <[hidden email]>

> So, I need some admin help with cwiki.apache.org, or at least advice.
> I've got a script that uses xmlrpc to confluence, and fetches all
> previous page(+versions), comments, attachments(+versions), tracks
> renames, usernames, and commit messages. I then take all this data
> and convert it into a long series of git commits, with the files laid
> out in a proper webslinger design. The author of each git commit is
> the person who changed the page, added a comment, or uploaded a new
> attachment.
>
> The issue I am having is the confluence installed on cwiki is old.
> Newer versions support returning the PageHistorySummary.versionComment
> thru the rpc; currently, I have to fall back and do a screen scrape of
> the viewpreviousversions.action page.
>
> Where should I ask for help on this, getting this new api implemented?

infra team: [hidden email]

I put them in copy

Jacques

> I also have suggestions to make the api more lightweight, when doing
> incremental updates(which my system supports).
>
> As a side note, there is a severe lack of version comments. This
> script ends up producing 3117 commits. Some of those are page
> renames/comments/attachments, which don't have a commit message. Most
> are page commits. There are only 70 change messages. It'd be nice if
> people would comment when they change a page, but I don't see a way to
> enforce that.
>

Adam Heath-2

Re: ofbiz wiki(confluence)

On 09/21/2010 11:53 AM, Jacques Le Roux wrote:
> From: "Adam Heath" <[hidden email]>
>> So, I need some admin help with cwiki.apache.org, or at least advice.
>> I've got a script that uses xmlrpc to confluence, and fetches all
>> previous page(+versions), comments, attachments(+versions), tracks
>> renames, usernames, and commit messages. I then take all this data and
>> convert it into a long series of git commits, with the files laid out
>> in a proper webslinger design. The author of each git commit is the
>> person who changed the page, added a comment, or uploaded a new
>> attachment.

This webslinger layout is still in flux, as is my script. The basic
logic works, however, by fetching all meta data, storing most of the
bulk of that in a temporary cache folder(only for the duration of the
script), then sorting each item by date, and replaying the set of
changes one by one.
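
The sort-and-replay step can be sketched roughly as follows. This is an illustration only: the record keys (`date`, `name`, `email`, `message`) are hypothetical, and dates are assumed to be ISO 8601 strings in one timezone so that they sort as text. The functions only build the `git commit` command lines; running them is left to the caller.

```python
from operator import itemgetter


def replay_order(changes):
    """Return the change records sorted oldest first."""
    return sorted(changes, key=itemgetter("date"))


def commit_args(change):
    """Build the `git commit` argv for one change record."""
    return [
        "git", "commit", "--quiet",
        "--allow-empty-message",  # many wiki edits carry no comment
        "--author={name} <{email}>".format(**change),
        "--date={date}".format(**change),
        "-m", change.get("message", ""),
    ]
```

Each page edit, comment, or attachment upload becomes one record, so the author of every commit is the person who made that change.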

It's optimized by storing the 'lastFoo' stuff for each
page/comment/attachment/(title->pageId mapping) as needed, so that it
can detect newer versions, etc, and not have to do anything. A
refresh after a full download against the OFBIZ space takes 2 minutes,
with nothing new to fetch.
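
A rough sketch of that 'lastFoo' check, assuming the cache is a dict from an item key (page id, comment id, attachment name) to the last version already imported. The names here are hypothetical:

```python
def stale_items(last_seen, current_versions):
    """Return the keys whose current version is newer than the cached one.

    `last_seen` maps item key -> last imported version.
    `current_versions` maps item key -> version reported by the server.
    Items missing from the cache count as version 0, so they are fetched.
    """
    return sorted(
        key
        for key, version in current_versions.items()
        if version > last_seen.get(key, 0)
    )
```

When nothing has changed the result is empty and no further requests are needed.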

>> The issue I am having is the confluence installed on cwiki is old.
>> Newer versions support returning the PageHistorySummary.versionComment
>> thru the rpc; currently, I have to fall back and do a screen scrape of
>> the viewpreviousversions.action page.

CONFDEV docs definitely list a versionComment field on
PageHistorySummary, which is not exposed in the 3.2.0 installed on cwiki.

>> Where should I ask for help on this, getting this new api implemented?
>
> infra team: [hidden email]
>
> I put them in copy

Thanks. I'm putting more information in this email; I've left
dev@ofbiz on the cc for this email, as others might be interested in
what I have discovered.

>> I also have suggestions to make the api more lightweight, when doing
>> incremental updates(which my system supports).

Here are the suggestions:

I can fetch all attachments for a page. But the attachment data
returned doesn't include the current version as a field. I have to
split the download url(which is sub-optimal; it includes the current
version as a parameter). It might be nice to have an
AttachmentSummary type record.
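
For illustration, pulling the version out of the download url could look like the sketch below. The parameter name `version` and the example url shape are assumptions; only the fact that the version travels as a url parameter comes from the description above.

```python
from urllib.parse import parse_qs, urlsplit


def attachment_version(download_url):
    """Return the version from the url's query string, or None if absent."""
    query = parse_qs(urlsplit(download_url).query)
    values = query.get("version")  # assumed parameter name
    return int(values[0]) if values else None
```

A dedicated version field on the attachment record would make this parsing unnecessary.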

What if someone uploads an attachment, then a new version of the attachment,
then changes the page, then deletes the attachment? How could I fetch
that information? I don't see a way to fetch all attachments for all
time against a particular history. This is also a problem for deleted
pages, comments, and labels(probably others).

Comments in confluence support editing. Is this history stored, and
if so, can I get access to it?

Are labels versioned?

Children of pages are versioned, only because pages themselves are
versioned, which includes the value of the parentId at the time the
page was changed. However, the frontend doesn't let you see older
children, when looking at previous versions.

It'd be nice if when calling getPageHistory, I could request a subset
of the list, instead of *all* page versions. If a page has 271
versions, and I have already fetched them, and the current page has a
version of 274, then I only really need to fetch 3 PageHistorySummary
records(to get the versionComment from newer versions of confluence).
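
The arithmetic in that example, plus the client-side filtering needed today because the call returns the whole list, can be sketched like this. The `version` key on each record is an assumption, and it is converted with int() in case the server returns it as a string:

```python
def missing_versions(last_fetched, current):
    """Return the version numbers not yet imported."""
    return list(range(last_fetched + 1, current + 1))


def newer_than(history, last_fetched):
    """Keep only history records newer than the last imported version."""
    return [h for h in history if int(h["version"]) > last_fetched]
```

With 271 versions already fetched and a current version of 274, missing_versions(271, 274) yields 272, 273 and 274, which are the three records a ranged request would return.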

BlogEntrySummary doesn't include version, but BlogEntry does. And I
can't fetch old versions of blogs.

>>
>> As a side note, there is a severe lack of version comments. This
>> script ends up producing 3117 commits. Some of those are page
>> renames/comments/attachments, which don't have a commit message. Most
>> are page commits. There are only 70 change messages. It'd be nice if
>> people would comment when they change a page, but I don't see a way to
>> enforce that.
>>
>

Jacques Le Roux

Re: ofbiz wiki(confluence)

Administrator

Adam,

There is currently a beginning effort to create a CMS for apache.org (infrastructure/trunk/projects/cms). Is yours related to this
effort?

Jacques
PS: Not sure how to access infrastructure/trunk/projects/cms/README with the rights I have

Adam Heath-2

Re: ofbiz wiki(confluence)

On 09/21/2010 02:07 PM, Jacques Le Roux wrote:

> Adam,
>
> There is currently a beginning effort to create a CMS for apache.org
> (infrastructure/trunk/projects/cms). Is yours related to this effort?

No, it's not. Based on how much time I've spent already(started my
importer last Friday), and how familiar I am with ofbiz, it'd probably
take me about 2 months to get mostly feature compatible with
confluence(that's for a single person working in his spare time).

> Jacques
> PS: Not sure how to access infrastructure/trunk/projects/cms/README
> with the rights I have

You mean it's not public?

Joe Schaefer

Re: ofbiz wiki(confluence)

The url is here
https://svn.apache.org/repos/infra/infrastructure/trunk/projects/cms
and is publicly readable.

Jacques Le Roux

Re: ofbiz wiki(confluence)

Administrator

Thanks Joe,

I quickly tried through Subclipse and got an error.
I guess now Adam has a better idea of what I was talking about.
I mean maybe Webslinger could be used, just my 2 cts...

Jacques

Adam Heath-2

Re: ofbiz wiki(confluence)

On 09/21/2010 03:53 PM, Jacques Le Roux wrote:
> Thanks Joe,
>
> I quickly tried through Subclipse and got an error.
> I guess now Adam has a better idea of what I was talking about.
> I mean maybe Webslinger could be used, just my 2 cts...

I will attempt to describe webslinger for those who haven't ever heard
of it before.

The major features(bullet points) of webslinger-core are:

* Content data stored as raw files. This is to allow normal programs,
like grep, find, vim, dreamweaver, photoshop, git, and svn, to work
without modification.

* Makes use of commons-vfs, and a custom set of layered filesystems.

* One layered filesystem is called 'flat'. Arbitrary attributes
(FileObject.getContent().getAttribute(name)) are stored as
path/to/file@, into separate files. Again, this allows for easy
integration with other systems.

* Another layered filesystem is called 'wsvfs'. This is an
overlay/cow type filesystem, where multiple real filesystems are
combined on the fly, giving merged directory listings, with support
for up-copy and whiteout. Any point of the tree can 'overlay' any
other part of the tree, although this feature isn't normally necessary.

* Automatic extension resolution. This allows for pretty urls that
don't have extensions, and allows the implementation on the server to
be changed as necessary. End-users have problems with extensions, so
they are hidden.

* Any 'path' can be configured to do its own sub-path management.
This allows for nice urls like /shop/product/$productId/detail and
/shop/cart/add/$productId and /Login/Path/To/Protected/Page. These
urls then show up nicely in hit reports. They are also easier for
end-users to remember.

* Automatic attribute inheritance. Extensions are used to find the
mime-type of a file. Or the mime-type can be set directly on the
file. Then, any attribute files set in
/WEB-INF/DefaultMimeAttributes/$mime/$type are inherited for the
resource in question. This allows mapping all ${page}.cf to
application/x-server-side-confluence, creating an attribute called
'type' with a value of 'confluence-page'. More on the types in a bit.

* Every resource has a type, and a handler. Standard types are jsp,
cgi, binary. Base types are event(bsf-based), code, template. Type
can also be servlet, or, even more advanced(but not ready to be
released) is 'vaadin' as a type.

* Several languages are integrated: template:
freemarker/velocity/text, bsf+code:
groovy/janino(java)/jython/rhino/bsh/quercus(php).

* Macros called by a template language can be implemented in *any*
webslinger resource(any type, any language). Each integrated template
type has proxies implemented that allow it to call back into
webslinger macros. velocity-> #Merge("/path/to/file",
"/template/to/wrap/it/with"), freemarker-> <@Merge
path="/path/to-file" template0="/template/to/wrap/it/with"/>. Macros
with content bodies are fully supported as well.

* Support for one-type 'wrapper' of a text output, and then different
page styles. Partial-ajax page updates can then skip this, and do
smart updates of regions of the browser.

The above list is a non-exhaustive list of features in
webslinger-core. It's really generic, and not tied to any particular
implementation.
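
To make the overlay idea from the 'wsvfs' bullet concrete, here is a toy sketch of a merged directory listing with whiteout. It is not webslinger code and says nothing about how wsvfs is actually implemented: the layers are plain dicts, and `WHITEOUT` is a hypothetical marker meaning "hide this name from lower layers".

```python
WHITEOUT = object()  # hypothetical marker: entry deleted in an upper layer


def merged_listing(layers):
    """Merge directory layers into one listing.

    `layers` is a list of dicts (name -> entry), highest priority first.
    The first layer that mentions a name wins; a whiteout entry hides
    the name even if lower layers still contain it.
    """
    seen = {}
    for layer in layers:
        for name, entry in layer.items():
            seen.setdefault(name, entry)
    return sorted(name for name, entry in seen.items() if entry is not WHITEOUT)
```

For example, with an upper layer holding a changed a.txt and a whiteout for b.txt over a lower layer holding a.txt, b.txt and c.txt, the merged listing is a.txt and c.txt.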

The other major difference is that webslinger is
*itself* a servlet container, just like catalina or glassfish.
However, what sets it apart from all others is that it doesn't run
standalone; instead, it is installed into a parent container. It then
fakes/wraps everything, to support its fancy stuff. It supports
running standard servlets, but they then get backed by commons-vfs, with
overlay support, etc. This implementation isn't perfect, and really
needs to be improved upon.

I've been working on a demo for the ofbiz community to play with.
However, the existing embedded site in the repository was rather
small, so I wrote an importer to pull stuff from cwiki, which is what
then started this thread.

ps: the license on all our code is asl 2.0

Joe Schaefer

Re: ofbiz wiki(confluence)

Sounds interesting, but for us we require static exports.
Since you're using flat files that might not be all that
hard for you to implement.

Confluence as a CMS has an interesting future ahead of it
at the ASF. Right now we have a hard dependency on the
auto-export plugin, whose support characteristics prevent
us from running the latest versions of confluence. If
the situation doesn't change over the next few months,
we'll likely just phase out the CMS aspects of confluence
and replace it with something that natively supports
static exports.

Adam Heath-2

Re: ofbiz wiki(confluence)

On 09/21/2010 04:41 PM, Joe Schaefer wrote:
> Sounds interesting, but for us we require static exports.
> Since you're using flat files that might not be all that
> hard for you to implement.

Why? What kind of load do you have? We (brainfood) have survived
slashdotting without resorting to fancy frontends like varnish. It's
been written to be nonblocking (no synchronized keywords), to use
weak/soft references, and to not create sessions until absolutely necessary.
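
As a rough illustration of the weak-reference idea only, here is a small Python sketch of a cache whose entries disappear once nothing else holds them. This is not webslinger code: in Java the equivalent would be built on `java.lang.ref.WeakReference` or `SoftReference`, Python has no soft references, and the `Page` and `PageCache` names are invented for the example. The sketch also makes no attempt at the nonblocking part.

```python
import gc
import weakref


class Page:
    """Stand-in for an expensive-to-build object (hypothetical)."""

    def __init__(self, path):
        self.path = path


class PageCache:
    def __init__(self):
        # Entries vanish automatically once nothing else references the value.
        self._pages = weakref.WeakValueDictionary()
        self.loads = 0

    def get(self, path):
        page = self._pages.get(path)
        if page is None:
            page = Page(path)  # pretend this is the expensive load
            self.loads += 1
            self._pages[path] = page
        return page


cache = PageCache()
first = cache.get("/index")
again = cache.get("/index")  # served from the cache while `first` is alive
same_object = first is again
loads_while_held = cache.loads

del first, again
gc.collect()                 # make collection explicit for the demo
cache.get("/index")          # the entry is gone, so it is rebuilt
print(same_object, loads_while_held, cache.loads)
```

The cache never pins memory on its own: it only speeds up access to objects that some request is still using, which is the property that matters under a traffic spike.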

> Confluence as a CMS has an interesting future ahead of it
> at the ASF. Right now we have a hard dependency on the
> auto-export plugin, whose support characteristics prevent
> us from running the latest versions of confluence. If
> the situation doesn't change over the next few months,
> we'll likely just phase out the CMS aspects of confluence
> and replace it with something that natively supports
> static exports.

We would like to support static exports too, and it might even be
possible with little effort. However, it just hasn't been necessary
for us, as we've never had a problem with any kind of load whatsoever.
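
For what it's worth, a static export over flat files could look roughly like the Python sketch below. It assumes content lives as plain files under one directory and that a `render` function turns each file into HTML; both `render` and `export_static` are hypothetical names, and nothing here reflects how webslinger or the auto-export plugin actually works.

```python
import tempfile
from pathlib import Path


def render(source: Path) -> str:
    # Placeholder renderer (assumption): a real site would run its templates here.
    return "<html><body>" + source.read_text(encoding="utf-8") + "</body></html>"


def export_static(content_root: Path, out_root: Path) -> list:
    """Render every file under content_root into a mirrored tree of .html files."""
    written = []
    for source in sorted(content_root.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(content_root)
        target = (out_root / relative).with_suffix(".html")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(render(source), encoding="utf-8")
        written.append(target)
    return written


with tempfile.TemporaryDirectory() as tmp:
    content = Path(tmp) / "content"
    export = Path(tmp) / "export"
    (content / "docs").mkdir(parents=True)
    (content / "index.txt").write_text("home", encoding="utf-8")
    (content / "docs" / "faq.txt").write_text("faq", encoding="utf-8")

    exported = export_static(content, export)
    names = [p.relative_to(export).as_posix() for p in exported]
    sample_html = (export / "index.html").read_text(encoding="utf-8")
    print(names)
```

The exported tree can then be served by any plain file server, with the dynamic container only needed when content changes.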

Joe Schaefer

Re: ofbiz wiki(confluence)

About 10M hits a day. No Java app hosted on
a single machine would survive for 5 minutes
with our load.
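
For a sense of scale, the quoted figure can be turned into an average request rate with a couple of lines of Python. This is an average only; the thread does not say how peaky the traffic is, so real peaks may be well above it.

```python
HITS_PER_DAY = 10_000_000       # figure quoted above
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

average_rps = HITS_PER_DAY / SECONDS_PER_DAY
hits_in_five_minutes = average_rps * 5 * 60

print(f"average: {average_rps:.1f} requests/second")      # average: 115.7 requests/second
print(f"in 5 minutes: {hits_in_five_minutes:.0f} requests")  # in 5 minutes: 34722 requests
```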

----- Original Message ----

> From: Adam Heath <[hidden email]>
> To: Joe Schaefer <[hidden email]>
> Cc: Jacques Le Roux <[hidden email]>; [hidden email];
>[hidden email]
> Sent: Tue, September 21, 2010 5:48:26 PM
> Subject: Re: ofbiz wiki(confluence)
>
> On 09/21/2010 04:41 PM, Joe Schaefer wrote:
> > Sounds interesting, but for us we require static exports.
> > Since you're using flat files that might not be all that
> > hard for you to implement.
>
> Why? What kind of load do you have? We(brainfood) have survied
> slashdotting, without resorting to fancy frontends like varnish. It's
> been written to be nonblocking(no synchronized keywords), use
> weak/soft references, and not create sessions until absolutely nescessary.
>
> > Confluence as a CMS has an interesting future ahead of it
> > at the ASF. Right now we have a hard dependency on the
> > auto-export plugin, whose support characteristics prevent
> > us from running the latest versions of confluence. If
> > the situation doesn't change over the next few months,
> > we'll likely just phase out the CMS aspects of confluence
> > and replace it with something that natively supports
> > static exports.
>
> We would like to support static exports too, and it might even be
> possible, with little effort. However, it's just not been nescessary
> for us, as we've never had a problem with any kind of load whatsoever.
>
> > ----- Original Message ----
> >> From: Adam Heath<[hidden email]>
> >> To: Jacques Le Roux<[hidden email]>
> >> Cc: Joe Schaefer<[hidden email]>; [hidden email];
> >> [hidden email]
> >> Sent: Tue, September 21, 2010 5:34:10 PM
> >> Subject: Re: ofbiz wiki(confluence)
> >>
> >> On 09/21/2010 03:53 PM, Jacques Le Roux wrote:
> >>> Thanks Joe,
> >>>
> >>> I quickly tried through Subclipse and got an error.
> >>> I guess now Adam has a better idea of what I was talking about.
> >>> I mean maybe Webslinger could be used, just my 2 cts...
> >>
> >> I will attempt to describe webslinger for those who haven't ever heard of
>it
> >> before.
> >>
> >> The major features(bullet points) of webslinger-core are:
> >>
> >> * Content data stored as raw files. This is to allow normal programs,
>like
> >> grep, find, vim, dreamweaver, photoshop, git, svn work without
>modifications.
> >>
> >> * Makes use of commons-vfs, and a custom set of layered filesystems.
> >>
> >> * One layered filesystem is called 'flat'. Arbitrary attributes
> >> (FileObject.getContent().getAttribute(name)) are stored as path/to/file@,
>into
> >> separate files. Again, this allows for easy integration with other
>systems.
> >>
> >> * Another layered filesystem is called 'wsvfs'. This is an overlay/cow
>type
> >> filesystem, where multiple real filesystems are combined on the fly,
>giving
> >> merged directory listings, with support for up-copy and whiteout. Any
>point of
> >> the tree can 'overlay' any other part of the tree, altho this feature
>isn't
> >> normally nescessary.
> >>
> >> * Automatic extension resolution. This allows for pretty urls that don't
>have
> >> extensions, and allow the implementation on the server to be changed as
> >> nescessary. End-users have problems with extensions, so that is hidden.
> >>
> >> * Any 'path' can be configured to do it's own sub-path management. This
>allows
> >> for nice urls like /shop/product/$productId/detail and
> >> /shop/cart/add/$productId and /Login/Path/To/Protected/Page. These urls
>then
> >> show up nicely in hit reports. They are also easier for end-users to
>remember.
> >>
> >> * Automatic attribute inheritance. Extensions are used to find the
>mime-type
> >> of a file. Or the mime-type can be set directly on the file. Then, any
> >> attribute files set in /WEB-INF/DefaultMimeAttributes/$mime/$type are
>inherited
> >> for the resource in question. This allows mapping all ${page}.cf to
> >> application/x-server-side-confluence, creating an attribute called 'type'
>with
> >> a value of 'confluence-page'. More on the types in a bit.
> >>
> >> * Every resource has a type, and a handler. Standard types are jsp,

cgi,

> >> binary. Base types are event(bsf-based), code, template. Type can also
>be
> >> servlet, or, even more advanced(but not ready to be released) is 'vaadin'
>as a
> >> type.
> >>
> >> * Several languages are integrated: template: freemarker/velocity/text,
> >> bsf+code: groovy/janino(java)/jython/rhino/bsh/quercus(php).
> >>
> >> * Macros called by a template language can be implemented in *any*
>webslinger
> >> resource(any type, any language). Each integrated template type has
>proxies
> >> implemented that allow it to call back into webslinger macros.

velocity->
> >> #Merge("/path/to/file", "/template/to/wrap/it/with"), freemarker->
><@Merge
> >> path="/path/to-file" template0="/template/to/wrap/it/with"/>. Support
for

> >> macros with content bodies is fully supported as well.
> >>
> >> * Support for one-type 'wrapper' of a text output, and then different
>page
> >> styles. Partial-ajax page updates can then skip this, and do smart
>updates of
> >> regions of the browser.
> >>
> >> The above list is an non-inclusive list of features in webslinger-core.
>It's
> >> really generic, and not tied to any particular implementation.
> >>
> >> The other major thing different about it, is that webslinger is *itself*
> >> a servlet container, just like catalina or glassfish. However, what sets
> >> it apart from all others, is that it doesn't run standalone; instead, it
> >> is installed into a parent container. It then fakes/wraps everything, to
> >> support its fancy stuff. It supports running standard servlets, but they
> >> then get backed by commons-vfs, with overlay support, etc. This
> >> implementation isn't perfect, and really needs to be improved upon.
> >>
> >> I've been working on a demo for the ofbiz community to play with.
> >> However, the existing embedded site in the repository was rather small,
> >> so I wrote an importer to pull stuff from cwiki, which is what then
> >> started this thread.
> >>
> >> ps: the license on all our code is asl 2.0
> >>
> >>>
> >>> Jacques
> >>>
> >>> From: "Joe Schaefer"<[hidden email]>
> >>>> The url is here
> >>>> https://svn.apache.org/repos/infra/infrastructure/trunk/projects/cms
> >>>> and is publicly readable.
> >>>>
> >>>>
> >>>>
> >>>> ----- Original Message ----
> >>>>> From: Adam Heath<[hidden email]>
> >>>>> To: Jacques Le Roux<[hidden email]>
> >>>>> Cc: [hidden email]; [hidden email]
> >>>>> Sent: Tue, September 21, 2010 3:42:35 PM
> >>>>> Subject: Re: ofbiz wiki(confluence)
> >>>>>
> >>>>> On 09/21/2010 02:07 PM, Jacques Le Roux wrote:
> >>>>>> From: "Adam Heath"<[hidden email]>
> >>>>>>> On 09/21/2010 11:53 AM, Jacques Le Roux wrote:
> >>>>>>>> From: "Adam Heath"<[hidden email]>
> >>>>>>>>> So, I need some admin help with cwiki.apache.org, or at least
> >>>>>>>>> advice. I've got a script that uses xmlrpc to confluence, and
> >>>>>>>>> fetches all previous page(+versions), comments,
> >>>>>>>>> attachments(+versions), tracks renames, usernames, and commit
> >>>>>>>>> messages. I then take all this data and convert it into a long
> >>>>>>>>> series of git commits, with the files laid out in a proper
> >>>>>>>>> webslinger design. The author of each git commit is the
> >>>>>>>>> person who changed the page, added a comment, or uploaded a new
> >>>>>>>>> attachment.
> >>>>>>>
> >>>>>>> This webslinger layout is still in flux, as is my script. The basic
> >>>>>>> logic works, however, by fetching all meta data, storing most of
> >>>>>>> the bulk of that in a temporary cache folder(only for the duration
> >>>>>>> of the script), then sorting each item by date, and replaying the
> >>>>>>> set of changes one by one.
> >>>>>>>
> >>>>>>> It's optimized by storing the 'lastFoo' stuff for each
> >>>>>>> page/comment/attachment/(title->pageId mapping) as needed, so that
> >>>>>>> it can detect newer versions, etc, and not have to do anything. A
> >>>>>>> refresh after a full download against the OFBIZ space takes 2
> >>>>>>> minutes, with nothing new to fetch.
> >>>>>>>
> >>>>>>>>> The issue I am having is the confluence installed on cwiki is
> >>>>>>>>> old. Newer versions support returning the
> >>>>>>>>> PageHistorySummary.versionComment thru the rpc; currently, I
> >>>>>>>>> have to fall back and do a screen scrape of
> >>>>>>>>> the viewpreviousversions.action page.
> >>>>>>>
> >>>>>>> CONFDEV docs definitely list a versionComment field on
> >>>>>>> PageHistorySummary, that is not exposed in 3.2.0 installed on cwiki.
> >>>>>>>
> >>>>>>>>> Where should I ask for help on this, getting this new api
> >>>>>>>>> implemented?
> >>>>>>>>
> >>>>>>>> infra team: [hidden email]
> >>>>>>>>
> >>>>>>>> I put them in copy
> >>>>>>>
> >>>>>>> Thanks. I'm putting more information in this email; I've left
> >>>>>>> dev@ofbiz on the cc for this email, as others might be interested in
> >>>>>>> what I have discovered.
> >>>>>>>
> >>>>>>>>> I also have suggestions to make the api more lightweight, when
> >>>>>>>>> doing incremental updates(which my system supports).
> >>>>>>>
> >>>>>>> Here are the suggestions:
> >>>>>>>
> >>>>>>> I can fetch all attachments for a page. But the attachment data
> >>>>>>> returned doesn't include the current version as a field. I have to
> >>>>>>> split the download url(which is sub-optimal; it includes the
> >>>>>>> current version as a parameter). It might be nice to have an
> >>>>>>> AttachmentSummary type record.
> >>>>>>>
> >>>>>>> What if someone uploads an attachment, then a new version of the
> >>>>>>> attachment, then changes the page, then deletes the attachment? How
> >>>>>>> could I fetch that information? I don't see a way to fetch all
> >>>>>>> attachments for all time against a particular history. This is also
> >>>>>>> a problem for deleted pages, comments, and labels(probably others).
> >>>>>>>
> >>>>>>> Comments in confluence support editing. Is this history stored,
> >>>>>>> and if so, can I get access to it?
> >>>>>>>
> >>>>>>> Are labels versioned?
> >>>>>>>
> >>>>>>> Children of pages are versioned, only because pages themselves are
> >>>>>>> versioned, which includes the value of the parentId at the time the
> >>>>>>> page was changed. However, the frontend doesn't let you see older
> >>>>>>> children, when looking at previous versions.
> >>>>>>>
> >>>>>>> It'd be nice if when calling getPageHistory, I could request a
> >>>>>>> subset of the list, instead of *all* page versions. If a page has
> >>>>>>> 271 versions, and I have already fetched them, and the current page
> >>>>>>> has a version of 274, then I only really need to fetch 3
> >>>>>>> PageHistorySummary records(to get the versionComment from newer
> >>>>>>> versions of confluence).
> >>>>>>>
> >>>>>>> BlogEntrySummary doesn't include version, but BlogEntry does. And I
> >>>>>>> can't fetch old versions of blogs.
> >>>>>>>
> >>>>>>>
> >>>>>>>>>
> >>>>>>>>> As a side note, there is a severe lack of version comments. This
> >>>>>>>>> script ends up producing 3117 commits. Some of those are page
> >>>>>>>>> renames/comments/attachments, which don't have a commit message.
> >>>>>>>>> Most are page commits. There are only 70 change messages. It'd
> >>>>>>>>> be nice if people would comment when they change a page, but I
> >>>>>>>>> don't see a way to enforce that.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>> Adam,
> >>>>>>
> >>>>>> There is currently a beginning effort to create a CMS for apache.org
> >>>>>> (infrastructure/trunk/projects/cms) is yours related to this effort?
> >>>>>
> >>>>> No, it's not. Based on how much time I've spent already(started my
> >>>>> importer last friday), and how familiar I am with ofbiz, it'd
> >>>>> probably take me 2 months to get mostly feature compatible with
> >>>>> confluence(that's for a single person working in his spare time).
> >>>>>
> >>>>>> Jacques
> >>>>>> PS: Not sure how to access
> >>>>>> infrastructure/trunk/projects/cms/README with the rights I have
> >>>>>
> >>>>> You mean it's not public?
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
> >
> >
>
>

Adam Heath-2

Re: ofbiz wiki(confluence)

On 09/21/2010 04:50 PM, Joe Schaefer wrote:
> About 10M hits a day. No java app hosted on
> a single machine would survive for 5 minutes
> with our load.

Are those just page requests(html), or everything(images+css+other files)?

Joe Schaefer

Re: ofbiz wiki(confluence)

I don't recall the breakdown, the 10M figure
counts total daily traffic. You could look
at Vadim's stats for more details- the bottom
line is that no app that doesn't support static
exports will function as a suitable CMS for
Apache, which is why we're rolling our own.

----- Original Message ----

> From: Adam Heath <[hidden email]>
> To: Joe Schaefer <[hidden email]>
> Cc: Jacques Le Roux <[hidden email]>; [hidden email]; [hidden email]
> Sent: Tue, September 21, 2010 6:11:53 PM
> Subject: Re: ofbiz wiki(confluence)
>
> On 09/21/2010 04:50 PM, Joe Schaefer wrote:
> > About 10M hits a day. No java app hosted on
> > a single machine would survive for 5 minutes
> > with our load.
>
> Are those just page requests(html), or everything(images+css+other files)?
>

Adam Heath-2

Re: ofbiz wiki(confluence)

On 09/21/2010 05:15 PM, Joe Schaefer wrote:
> I don't recall the breakdown, the 10M figure
> counts total daily traffic. You could look
> at Vadim's stats for more details- the bottom
> line is that no app that doesn't support static
> exports will function as a suitable CMS for
> Apache, which is why we're rolling our own.

Ok, I've looked. 10M seems to be *all* requests. That number doesn't
scare me. I've got a site live right now, http://www.hailmerry.com/,
that is reporting 190req/s with ab(from apache http) while on
localhost. That's without any fancy supercache sitting in front.
This site supports online live editing of content. Anonymous users
have no session, and we make a point of reducing database access for
hot-points.

This is running on a single large shared iscsi disk host. The cpu
node is an 8-way 2.33GHz cpu, using xen, with the domU given 512M and
a single cpu. The hosting framework is nowhere near what could be
considered super-fast.

The main feature, that I haven't mentioned before now, that helps with
this, is a thing we have designed called TTLObject. It is designed to
protect method calls, by saving their results, and returning old
values for a certain amount of time. It is non-blocking, and uses a
state engine internally. It was overhauled to follow the design
patterns in Java Concurrency in Practice.
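
To make the idea concrete, here is a minimal sketch of a TTL-style cached
value. It is not the actual TTLObject code, which is not shown in this
thread; the class name TTLValue, the loader/ttl/clock parameters, and the
use of a short lock (instead of the lock-free state engine described above)
are all assumptions made for illustration. Once a value exists, callers
that arrive while it is being refreshed get the old value back instead of
waiting.

```python
import threading
import time


class TTLValue:
    """Hypothetical sketch: cache a loader's result for `ttl` seconds."""

    def __init__(self, loader, ttl, clock=time.monotonic):
        self._loader = loader          # the expensive call being protected
        self._ttl = ttl                # seconds a result stays fresh
        self._clock = clock            # injectable so tests can fake time
        self._lock = threading.Lock()  # guards the fields below
        self._has_value = False
        self._refreshing = False
        self._value = None
        self._expires = 0.0

    def _store(self, value):
        # Must be called with the lock held.
        self._value = value
        self._expires = self._clock() + self._ttl
        self._has_value = True

    def get(self):
        with self._lock:
            if not self._has_value:
                # Simplification: the very first load happens under the lock.
                self._store(self._loader())
                return self._value
            if self._clock() < self._expires or self._refreshing:
                # Fresh, or stale while another thread is already reloading:
                # hand back the old value without waiting.
                return self._value
            self._refreshing = True    # this caller does the reload

        try:
            new_value = self._loader()  # runs outside the lock
        except Exception:
            with self._lock:
                self._refreshing = False
            raise
        with self._lock:
            self._store(new_value)
            self._refreshing = False
            return new_value
```

Usage would look like `menu = TTLValue(load_menu_from_db, ttl=30)` followed
by `menu.get()` on every request, where load_menu_from_db is a placeholder
for whatever expensive call is being protected.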

Again, I'm not afraid of those numbers. Our cms stuff(which isn't
quite up to the same level of feature integration as confluence),
stores most data as raw files(to allow svn/git tracking of history).
The database is not used for most functions.

Adam Heath-2

Re: ofbiz wiki(confluence)

On 09/21/2010 05:45 PM, Adam Heath wrote:

> On 09/21/2010 05:15 PM, Joe Schaefer wrote:
>> I don't recall the breakdown, the 10M figure
>> counts total daily traffic. You could look
>> at Vadim's stats for more details- the bottom
>> line is that no app that doesn't support static
>> exports will function as a suitable CMS for
>> Apache, which is why we're rolling our own.
>
> Ok, I've looked. 10M seems to be *all* requests. That number doesn't
> scare me. I've got a site live right now, http://www.hailmerry.com/,
> that is reporting 190req/s with ab(from apache http) while on localhost.
> That's without any fancy supercache sitting in front. This site supports
> online live editing of content. Anonymous users have no session, and we
> make a point of reducing database access for hot-points.
>
> This is running on a single large shared iscsi disk host. The cpu node
> is an 8-way 2.33GHz cpu, using xen, with the domU given 512M and a
> single cpu. The hosting framework is nowhere near what could be
> considered super-fast.
>
> The main feature, that I haven't mentioned before now, that helps with
> this, is a thing we have designed called TTLObject. It is designed to
> protect method calls, by saving their results, and returning old values
> for a certain amount of time. It is non-blocking, uses a state engine
> internally. It was overhauled to follow the design patterns in Java
> Concurrency in Practice.
>
> Again, I'm not afraid of those numbers. Our cms stuff(which isn't quite
> up to the same level of feature integration as confluence), stores most
> data as raw files(to allow svn/git tracking of history). The database is
> not used for most functions.

I'm still trying to understand the distribution of load. If my
understanding is wrong, then please tell me. It'll give me a target
to strive for. I've looked at more graphs at (1), and I see a
breakdown of page views per major sub-site(host).

Our current framework underwent a major rewrite 4 years ago, and it
was first deployed into a production state 3.5 years ago. That
particular site did 3,500,000 total requests the day after it went
live, 500,000 page views. There was no super-cache in front of it.
The version of the software at the time had a bug, where
If-Modified-Since processing didn't work, so all images were always
fetched by clients. Our filesystem code has been rewritten, to be
non-blocking(no synchronized keywords), the cow/overlay feature has
had a second-order of speedups, plus other speed fixes. If I
'flatten' the cow/overlay system, so it's not used, the system easily
approached 1000req/s(single page at a time).

So, I'm still not afraid of these numbers.

Now, if you were to combine *all* these hosts into one, and then try
to run it, we might have an issue. But, that's not what is currently
happening.

And, with the 10M number you originally gave, that is 115req/s. If
that is for a single page, then our current software handles that
fine. If that is for 115 different pages at the same time, then I
will have to get back to you, to try that test. I don't have a
program that can request 115 different pages at once.
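
For reference, the 115req/s figure is just the daily total spread evenly
over 86,400 seconds, so it is an average and says nothing about peaks. The
sketch below shows that arithmetic, plus one possible way to request many
different pages at once with a thread pool. The function names, the default
worker count, and the idea of passing in a list of URLs are illustrative
assumptions, not an existing tool.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

SECONDS_PER_DAY = 24 * 60 * 60


def requests_per_second(hits_per_day):
    """Average request rate if the daily total were spread evenly."""
    return hits_per_day / SECONDS_PER_DAY


def fetch_status(url, timeout=10):
    """Fetch one URL and return its HTTP status code.

    urlopen raises an exception for 4xx/5xx responses; this sketch lets
    that propagate to the caller.
    """
    with urlopen(url, timeout=timeout) as response:
        return response.status


def fetch_all(urls, fetch=fetch_status, workers=115):
    """Request every URL concurrently, using up to `workers` threads.

    Results come back in the same order as `urls`.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, urls))
```

Calling fetch_all() with a list of 115 distinct page URLs issues them
concurrently; timing the call gives a rough idea of how the backend behaves
under that mix, though a dedicated load tool would give better numbers.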

Also, if the problem is with remote clients tying up a
thread/connection slot, then that is a separate problem from the
backend system. The backend system should have a small thread pool,
so that it can run fast, and then the frontend(either catalina itself,
or apache/mod-jk) does the send using non-blocking-io.

1: http://people.apache.org/~vgritsenko/stats/index.html

Mark Thomas

Re: ofbiz wiki(confluence)

On 22/09/2010 09:09, Adam Heath wrote:
> Now, if you were to combine *all* these hosts into one, and then try to
> run it, we might have an issue. But, that's not what is currently
> happening.

Not sure what you mean here. www.a.o and every tlp.a.o site are served
from a single httpd instance (along with a handful of other virtual
hosts). Everything we require the cms to handle is currently handled by
a single httpd instance.

There are actually two machines. One in the EU and one in the US.
Normally we use geo-based load balancing but a single machine has to be
able to handle all of the traffic comfortably so we can do maintenance
on the other.

> And, with the 10M number you originally gave, that is 115req/s. If that
> is for a single page, then our current software handles that fine. If
> that is for 115 different pages at the same time, then I will have to
> get back to you, to try that test. I don't have a program that can
> request 115 different pages at once.

That is across all virtual hosts so they will be different pages.
http://www.apache.org/server-status will give you a snapshot of current
load.

> Also, if the problem is with remote clients tying up a thread/connection
> slot, then that is a separate problem from the backend system. The
> backend system should have a small thread pool, so that it can run fast,
> and then the frontend(either catalina itself, or apache/mod-jk) does the
> send using non-blocking-io.

The problem is that the systems we have tried before can't handle the load.

It sounds like you are using Tomcat under the covers. Whilst you will
maximise throughput when the threadpool size is roughly the same as the
number of cores on the machine, the overhead of having a few hundred
processing threads isn't that great. We would always front Tomcat with
httpd so an appropriate mod_proxy/mod_jk/Tomcat connector config can
ensure that we don't have to have one Tomcat thread per current
connection (since with keep-alive connections >> requests).

Mark