ofbiz wiki(confluence)

Adam Heath-2
So, I need some admin help with cwiki.apache.org, or at least advice.
I've got a script that talks to confluence over xmlrpc and fetches all
pages(+previous versions), comments, attachments(+versions), and tracks
renames, usernames, and commit messages.  I then take all this data
and convert it into a long series of git commits, with the files laid
out in a proper webslinger design.  The author of each git commit is
the person who changed the page, added a comment, or uploaded a new
attachment.
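
To illustrate the author mapping described above, here is a minimal sketch in Python (the thread does not say what language the script is written in). It only builds a `git commit` command line and does not run it. The function name and the `example.invalid` address scheme are assumptions made for the example; the wiki only supplies usernames.

```python
from datetime import datetime, timezone


def git_commit_argv(username, when, message=""):
    """Build a `git commit` command line attributed to a wiki user."""
    # Hypothetical address scheme: the wiki only gives us a username.
    author = f"{username} <{username}@example.invalid>"
    return [
        "git", "commit",
        "--allow-empty-message",          # many wiki edits carry no comment
        f"--author={author}",             # the person who made the change
        f"--date={when.isoformat()}",     # the time of the wiki change
        "-m", message,
    ]


if __name__ == "__main__":
    when = datetime(2010, 9, 21, 11, 53, tzinfo=timezone.utc)
    print(git_commit_argv("adam", when, "Rename page"))
```

Passing `--allow-empty-message` matters here because most of the imported changes have no version comment at all.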

The issue I am having is that the confluence installed on cwiki is old.
Newer versions support returning PageHistorySummary.versionComment
through the rpc; currently, I have to fall back to screen-scraping
the viewpreviousversions.action page.
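
The fallback described above could be sketched roughly like this, assuming the RPC structs arrive as plain dicts (as they do with Python's `xmlrpc.client`) and that each summary carries an `id` key. The `scrape` callback stands in for the screen scraper and is hypothetical.

```python
def version_comment(summary, scrape):
    """Return the version comment for one page-history record.

    summary: dict decoded from the RPC response (assumed shape).
    scrape:  caller-supplied function taking a page id and returning the
             comment scraped from the HTML history page, or None.
    """
    comment = summary.get("versionComment")
    if comment is None:
        # Older servers omit the field entirely, so fall back to scraping.
        comment = scrape(summary["id"])
    return comment or ""


if __name__ == "__main__":
    old_server = {"id": "42", "version": 3}
    new_server = {"id": "42", "version": 3, "versionComment": "fixed typo"}
    print(version_comment(old_server, lambda page_id: "scraped comment"))
    print(version_comment(new_server, lambda page_id: "unused"))
```

Once the server is upgraded, the scraper is simply never called, so the same code works against both old and new installs.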

Where should I ask for help on this, getting this new api implemented?

I also have suggestions to make the api more lightweight, when doing
incremental updates(which my system supports).

As a side note, there is a severe lack of version comments.  This
script ends up producing 3117 commits.  Some of those are page
renames/comments/attachments, which don't have a commit message.  Most
are page commits.  There are only 70 change messages.  It'd be nice if
people would comment when they change a page, but I don't see a way to
enforce that.

Re: ofbiz wiki(confluence)

Jacques Le Roux
Administrator
From: "Adam Heath" <[hidden email]>

> Where should I ask for help on this, getting this new api implemented?

infra team: [hidden email]

I put them in copy

Jacques
 


Re: ofbiz wiki(confluence)

Adam Heath-2
On 09/21/2010 11:53 AM, Jacques Le Roux wrote:
> From: "Adam Heath" <[hidden email]>
>> So, I need some admin help with cwiki.apache.org, or at least advice.
>> I've got a script that uses xmlrpc to confluence, and fetches all
>> previous page(+versions), comments, attachments(+versions), tracks
>> renames, usernames, and commit messages. I then take all this data and
>> convert it into a long series of git commits, with the files laid out
>> in a proper webslinger design. The author of each git commit is the
>> person who changed the page, added a comment, or uploaded a new
>> attachment.

This webslinger layout is still in flux, as is my script.  The basic
logic works, however: it fetches all metadata, stores most of the
bulk of that in a temporary cache folder(only for the duration of the
script), then sorts each item by date and replays the set of
changes one by one.
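
As a rough illustration of the sort-and-replay step, here is a small Python sketch. The `Change` record and its field names are made up for the example and are not taken from the actual script.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Change:
    """One wiki event; the field names are illustrative, not the script's."""
    when: datetime
    author: str
    kind: str   # "page", "comment", or "attachment"
    path: str


def replay_order(changes):
    """Return changes oldest first. sorted() is stable, so events with the
    same timestamp keep the order in which they were fetched."""
    return sorted(changes, key=lambda change: change.when)


def replay(changes, apply):
    """Call apply(change) once per change, in chronological order."""
    for change in replay_order(changes):
        apply(change)
```

In a real importer, `apply` would write the file into the working tree and create the commit; here it is left as a callback so the ordering logic can be tested on its own.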

It's optimized by storing the 'lastFoo' stuff for each
page/comment/attachment/(title->pageId mapping) as needed, so that it
can detect newer versions, etc, and skip anything that hasn't changed.
A refresh after a full download against the OFBIZ space takes 2 minutes,
with nothing new to fetch.
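
The 'lastFoo' bookkeeping might look something like the following sketch. It assumes each item can be identified by a key and an integer version; the key shape and function names are hypothetical.

```python
def stale_keys(last_seen, current):
    """Return the keys whose current version is newer than the cached one.

    last_seen and current both map a key (for example ("page", page_id))
    to an integer version; a key missing from last_seen counts as new.
    """
    return [key for key, version in current.items()
            if version > last_seen.get(key, 0)]


def mark_fetched(last_seen, current, keys):
    """Record the versions just fetched so the next run can skip them."""
    for key in keys:
        last_seen[key] = current[key]
```

When nothing has changed, `stale_keys` returns an empty list and the run has no content to download, which is the cheap refresh case mentioned above.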

>> The issue I am having is the confluence installed on cwiki is old.
>> Newer versions support returning the PageHistorySummary.versionComment
>> thru the rpc; currently, I have to fall back and do a screen scrape of
>> the viewpreviousversions.action page.

CONFDEV docs definitely list a versionComment field on
PageHistorySummary, which is not exposed in the 3.2.0 installed on cwiki.

>> Where should I ask for help on this, getting this new api implemented?
>
> infra team: [hidden email]
>
> I put them in copy

Thanks.  I'm putting more information in this email; I've left
dev@ofbiz on the cc for this email, as others might be interested in
what I have discovered.

>> I also have suggestions to make the api more lightweight, when doing
>> incremental updates(which my system supports).

Here are the suggestions:

I can fetch all attachments for a page.  But the attachment data
returned doesn't include the current version as a field.  I have to
split the download url(which is sub-optimal; it includes the current
version as a parameter).  It might be nice to have an
AttachmentSummary type record.
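
For reference, pulling the version out of the download url can be done with the standard library, as in this sketch. The query parameter name `version` and the sample url are assumptions for illustration; check them against what the server actually returns.

```python
from urllib.parse import parse_qs, urlsplit


def attachment_version(download_url, param="version"):
    """Return the integer version from the url's query string, or None
    if the parameter is absent."""
    query = parse_qs(urlsplit(download_url).query)
    values = query.get(param)
    return int(values[0]) if values else None


if __name__ == "__main__":
    # Illustrative url only, not copied from cwiki.
    url = "https://wiki.example.invalid/download/attachments/123/logo.png?version=4"
    print(attachment_version(url))
```

This works, but it depends on the url format staying stable, which is why a dedicated version field on the record would be nicer.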

What if someone uploads an attachment, then a new version of the
attachment, then changes the page, then deletes the attachment?  How
could I fetch that information?  I don't see a way to fetch all
attachments for all time against a particular history.  This is also a
problem for deleted pages, comments, and labels(probably others).

Comments in confluence support editing.  Is this history stored, and
if so, can I get access to it?

Are labels versioned?

Children of pages are versioned, only because pages themselves are
versioned, which includes the value of the parentId at the time the
page was changed.  However, the frontend doesn't let you see older
children, when looking at previous versions.

It'd be nice if when calling getPageHistory, I could request a subset
of the list, instead of *all* page versions.  If a page has 271
versions, and I have already fetched them, and the current page has a
version of 274, then I only really need to fetch 3 PageHistorySummary
records(to get the versionComment from newer versions of confluence).
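
The arithmetic in that example, and the client-side filtering the current API forces, can be sketched as follows. The dict shape of a history record is an assumption.

```python
def missing_versions(last_fetched, current):
    """Version numbers that still need fetching, oldest first."""
    return list(range(last_fetched + 1, current + 1))


def only_new(history, last_fetched):
    """Filter a full history list down to records newer than last_fetched.

    This is the work the client has to do today, after the server has
    already sent every record.
    """
    return [record for record in history
            if record["version"] > last_fetched]


if __name__ == "__main__":
    print(missing_versions(271, 274))
```

A server-side range parameter would let the client ask for just those three records instead of downloading the whole history and discarding most of it.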

BlogEntrySummary doesn't include version, but BlogEntry does.  And I
can't fetch old versions of blogs.



Re: ofbiz wiki(confluence)

Jacques Le Roux
Administrator

Adam,

There is currently an early effort to create a CMS for apache.org (infrastructure/trunk/projects/cms). Is yours related to this
effort?

Jacques
PS: I'm not sure how to access infrastructure/trunk/projects/cms/README with the rights I have.



Re: ofbiz wiki(confluence)

Adam Heath-2
On 09/21/2010 02:07 PM, Jacques Le Roux wrote:

> Adam,
>
> There is currently an early effort to create a CMS for apache.org
> (infrastructure/trunk/projects/cms). Is yours related to this effort?

No, it's not.  Based on how much time I've spent already (I started my
importer last Friday), and how familiar I am with ofbiz, it'd probably
take me 2 months to get mostly feature compatible with
confluence (that's for a single person working in his spare time).

> Jacques
> PS: I'm not sure how to access infrastructure/trunk/projects/cms/README
> with the rights I have.

You mean it's not public?

Re: ofbiz wiki(confluence)

Joe Schaefer
The url is here
https://svn.apache.org/repos/infra/infrastructure/trunk/projects/cms
and is publicly readable.




Re: ofbiz wiki(confluence)

Jacques Le Roux
Administrator
Thanks Joe,

I quickly tried through Subclipse and got an error.
I guess now Adam has a better idea of what I was talking about.
I mean maybe Webslinger could be used, just my 2 cts...

Jacques


Re: ofbiz wiki(confluence)

Adam Heath-2
On 09/21/2010 03:53 PM, Jacques Le Roux wrote:
> Thanks Joe,
>
> I quickly tried through Subclipse and got an error.
> I guess now Adam has a better idea of what I was talking about.
> I mean maybe Webslinger could be used, just my 2 cts...

I will attempt to describe webslinger for those who haven't ever heard
of it before.

The major features(bullet points) of webslinger-core are:

* Content data stored as raw files.  This allows normal programs,
like grep, find, vim, dreamweaver, photoshop, git, and svn, to work
without modification.

* Makes use of commons-vfs, and a custom set of layered filesystems.

* One layered filesystem is called 'flat'.  Arbitrary attributes
(FileObject.getContent().getAttribute(name)) are stored as
path/to/file@, into separate files.  Again, this allows for easy
integration with other systems.

* Another layered filesystem is called 'wsvfs'.  This is an
overlay/cow type filesystem, where multiple real filesystems are
combined on the fly, giving merged directory listings, with support
for up-copy and whiteout.  Any point of the tree can 'overlay' any
other part of the tree, although this feature isn't normally necessary.

* Automatic extension resolution.  This allows for pretty urls that
don't have extensions, and allows the implementation on the server to
be changed as necessary.  End-users have problems with extensions, so
they are hidden.

* Any 'path' can be configured to do its own sub-path management.
This allows for nice urls like /shop/product/$productId/detail and
/shop/cart/add/$productId and /Login/Path/To/Protected/Page.  These
urls then show up nicely in hit reports.  They are also easier for
end-users to remember.

* Automatic attribute inheritance.  Extensions are used to find the
mime-type of a file.  Or the mime-type can be set directly on the
file.  Then, any attribute files set in
/WEB-INF/DefaultMimeAttributes/$mime/$type are inherited for the
resource in question.  This allows mapping all ${page}.cf to
application/x-server-side-confluence, creating an attribute called
'type' with a value of 'confluence-page'.  More on the types in a bit.

* Every resource has a type, and a handler.  Standard types are jsp,
cgi, binary.  Base types are event(bsf-based), code, template.  A type
can also be servlet or, more advanced still(but not ready to be
released), 'vaadin'.

* Several languages are integrated: template:
freemarker/velocity/text, bsf+code:
groovy/janino(java)/jython/rhino/bsh/quercus(php).

* Macros called by a template language can be implemented in *any*
webslinger resource(any type, any language).  Each integrated template
type has proxies implemented that allow it to call back into
webslinger macros.  velocity-> #Merge("/path/to/file",
"/template/to/wrap/it/with"), freemarker-> <@Merge
path="/path/to-file" template0="/template/to/wrap/it/with"/>.  Macros
with content bodies are fully supported as well.

* Support for one-type 'wrapper' of a text output, and then different
page styles.  Partial-ajax page updates can then skip this, and do
smart updates of regions of the browser.

The above list is a non-exhaustive list of features in
webslinger-core.  It's really generic, and not tied to any particular
implementation.
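
To make the overlay idea from the 'wsvfs' bullet concrete, here is a toy sketch in Python of a merged directory listing with whiteout. It is not how webslinger implements it (that is built on commons-vfs in Java); the dict-based layers and the `WHITEOUT` marker are invented for illustration.

```python
# Marker meaning "this name is deleted in this layer" (hypothetical).
WHITEOUT = object()


def merged_listing(layers):
    """Merge directory layers into one listing.

    layers: list of dicts mapping file name -> content, topmost layer
    first. Upper layers win over lower ones, and a WHITEOUT entry hides
    the name from every layer beneath it.
    """
    merged = {}
    # Walk from the bottom layer up so that upper layers overwrite.
    for layer in reversed(layers):
        for name, content in layer.items():
            if content is WHITEOUT:
                merged.pop(name, None)
            else:
                merged[name] = content
    return merged


if __name__ == "__main__":
    top = {"index.html": "customised", "old.html": WHITEOUT}
    base = {"index.html": "default", "old.html": "legacy", "about.html": "about"}
    print(sorted(merged_listing([top, base])))
```

The up-copy part (copying a file into the top layer on first write) is left out; the sketch only covers how reads see a merged view.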

The other major difference is that webslinger is *itself* a servlet
container, just like catalina or glassfish.  However, what sets it
apart from all others is that it doesn't run standalone; instead, it
is installed into a parent container.  It then fakes/wraps everything,
to support its fancy stuff.  It supports running standard servlets,
which then get backed by commons-vfs, with overlay support, etc.  This
implementation isn't perfect, and really needs to be improved upon.

I've been working on a demo for the ofbiz community to play with.
However, the existing embedded site in the repository was rather
small, so I wrote an importer to pull stuff from cwiki, which is what
then started this thread.

ps: the license on all our code is asl 2.0



Re: ofbiz wiki(confluence)

Joe Schaefer
Sounds interesting, but for us we require static exports.
Since you're using flat files that might not be all that
hard for you to implement.

Confluence as a CMS has an interesting future ahead of it
at the ASF.  Right now we have a hard dependency on the
auto-export plugin,  whose support characteristics prevent
us from running the latest versions of confluence.  If
the situation doesn't change over the next few months,
we'll likely just phase out the CMS aspects of confluence
and replace it with something that natively supports
static exports.






     

Re: ofbiz wiki(confluence)

Adam Heath-2
On 09/21/2010 04:41 PM, Joe Schaefer wrote:
> Sounds interesting, but for us we require static exports.
> Since you're using flat files that might not be all that
> hard for you to implement.

Why?  What kind of load do you have?  We(brainfood) have survived
slashdotting, without resorting to fancy frontends like varnish.  It's
been written to be nonblocking(no synchronized keywords), use
weak/soft references, and not create sessions until absolutely necessary.
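
As an illustration of the "no sessions until absolutely necessary" point, here is a minimal sketch of lazy session creation. It is not webslinger's code; the `Request` class and the dict-based `store` are hypothetical. The `create` flag mirrors the servlet API's `getSession(boolean)`, where passing `false` returns null instead of allocating a session.

```python
import uuid
from typing import Dict, Optional


class Request:
    """Toy request that allocates a session only when one is first needed."""

    def __init__(self, store: Dict[str, dict], session_id: Optional[str] = None) -> None:
        self._store = store            # shared session table (hypothetical)
        self._session_id = session_id  # id presented by the client, if any

    def session(self, create: bool = True) -> Optional[dict]:
        # Reuse an existing session if the client presented a known id.
        if self._session_id in self._store:
            return self._store[self._session_id]
        # Read-only callers pass create=False and never allocate anything.
        if not create:
            return None
        self._session_id = uuid.uuid4().hex
        self._store[self._session_id] = {}
        return self._store[self._session_id]
```

Anonymous page views that only call `session(create=False)` leave the session table empty, so memory use grows with the number of visitors who actually need state rather than with raw hit count.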

> Confluence as a CMS has an interesting future ahead of it
> at the ASF.  Right now we have a hard dependency on the
> auto-export plugin,  whose support characteristics prevent
> us from running the latest versions of confluence.  If
> the situation doesn't change over the next few months,
> we'll likely just phase out the CMS aspects of confluence
> and replace it with something that natively supports
> static exports.

We would like to support static exports too, and it might even be
possible, with little effort.  However, it's just not been necessary
for us, as we've never had a problem with any kind of load whatsoever.
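
For illustration only, a static export over a flat-file content tree could look roughly like the sketch below. This is an assumption about the general shape of such an exporter, not something webslinger provides: `export_static` and the `render` callback are hypothetical names, and a real exporter would also need to handle templates, attributes, and binary files.

```python
from pathlib import Path
from typing import Callable


def export_static(source: Path, target: Path, render: Callable[[str], str]) -> int:
    """Render every file under source into the same relative path under target."""
    count = 0
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        dest = target / src.relative_to(source)
        # Mirror the directory layout of the source tree.
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_text(render(src.read_text(encoding="utf-8")), encoding="utf-8")
        count += 1
    return count
```

The exported tree can then be served by any plain web server, which is the property the static-export requirement is after.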



Re: ofbiz wiki(confluence)

Joe Schaefer
About 10M hits a day.  No java app hosted on
a single machine would survive for 5 minutes
with our load.
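
For scale, the daily figure can be converted to an average request rate. The snippet below is just that arithmetic; it assumes hits are spread evenly over the day, which real traffic is not, so peak rates will be higher than this average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(hits_per_day: int) -> float:
    """Average requests per second for a given daily hit count."""
    return hits_per_day / SECONDS_PER_DAY


rate = average_rate(10_000_000)
print(round(rate, 1))    # 115.7 requests per second on average
print(round(rate * 300)) # 34722 requests in a 5-minute window
```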



----- Original Message ----

> From: Adam Heath <[hidden email]>
> To: Joe Schaefer <[hidden email]>
> Cc: Jacques Le Roux <[hidden email]>; [hidden email];
>[hidden email]
> Sent: Tue, September 21, 2010 5:48:26 PM
> Subject: Re: ofbiz wiki(confluence)
>
> On 09/21/2010 04:41 PM, Joe Schaefer wrote:
> > Sounds interesting, but for  us we require static exports.
> > Since you're using flat files that might  not be all that
> > hard for you to implement.
>
> Why?  What kind  of load do you have?  We(brainfood) have survied
> slashdotting, without  resorting to fancy frontends like varnish.  It's
> been written to be  nonblocking(no synchronized keywords), use
> weak/soft references, and not  create sessions until absolutely nescessary.
>
> > Confluence as a CMS has  an interesting future ahead of it
> > at the ASF.  Right now we have a  hard dependency on the
> > auto-export plugin,  whose support  characteristics prevent
> > us from running the latest versions of  confluence.  If
> > the situation doesn't change over the next few  months,
> > we'll likely just phase out the CMS aspects of  confluence
> > and replace it with something that natively supports
> >  static exports.
>
> We would like to support static exports too, and it might  even be
> possible, with little effort.  However, it's just not been  nescessary
> for us, as we've never had a problem with any kind of load  whatsoever.
>
> > ----- Original Message ----
> >> From: Adam  Heath<[hidden email]>
> >> To:  Jacques Le Roux<[hidden email]>
> >>  Cc: Joe Schaefer<[hidden email]>; [hidden email];
> >> [hidden email]
> >>  Sent: Tue, September 21, 2010 5:34:10 PM
> >> Subject: Re: ofbiz  wiki(confluence)
> >>
> >> On 09/21/2010 03:53 PM, Jacques Le Roux  wrote:
> >>> Thanks Joe,
> >>>
> >>> I quickly  tried through Subclipse and got an error.
> >>> I guess now  Adam  has a better idea of what I was talking about.
> >>> I mean  maybe  Webslinger could be used, just my 2 cts...
> >>
> >> I  will attempt to describe  webslinger for those who haven't ever heard of  
>it
> >> before.
> >>
> >> The major  features(bullet  points) of webslinger-core are:
> >>
> >> * Content data stored  as  raw files.  This is to allow normal programs,
>like
> >>  grep, find, vim,  dreamweaver, photoshop, git, svn work without  
>modifications.
> >>
> >> * Makes use of commons-vfs, and a custom set of layered filesystems.
> >>
> >> * One layered filesystem is called 'flat'.  Arbitrary attributes
> >> (FileObject.getContent().getAttribute(name)) are stored as path/to/file@,
> >> into separate files.  Again, this allows for easy integration with other
> >> systems.
> >>
> >> * Another layered filesystem is called 'wsvfs'.  This is an overlay/cow
> >> type filesystem, where multiple real filesystems are combined on the fly,
> >> giving merged directory listings, with support for up-copy and whiteout.
> >> Any point of the tree can 'overlay' any other part of the tree, although
> >> this feature isn't normally necessary.
> >>
> >> * Automatic extension resolution.  This allows for pretty urls that don't
> >> have extensions, and allows the implementation on the server to be changed
> >> as necessary.  End-users have problems with extensions, so that is hidden.
> >>
> >> * Any 'path' can be configured to do its own sub-path management.  This
> >> allows for nice urls like /shop/product/$productId/detail and
> >> /shop/cart/add/$productId and /Login/Path/To/Protected/Page.  These urls
> >> then show up nicely in hit reports.  They are also easier for end-users to
> >> remember.
> >>
> >> * Automatic attribute inheritance.  Extensions are used to find the
> >> mime-type of a file.  Or the mime-type can be set directly on the file.
> >> Then, any attribute files set in /WEB-INF/DefaultMimeAttributes/$mime/$type
> >> are inherited for the resource in question.  This allows mapping all
> >> ${page}.cf to application/x-server-side-confluence, creating an attribute
> >> called 'type' with a value of 'confluence-page'.  More on the types in a
> >> bit.
> >>
> >> * Every resource has a type, and a handler.  Standard types are jsp, cgi,
> >> binary.  Base types are event(bsf-based), code, template.  Type can also
> >> be servlet, or, even more advanced(but not ready to be released) is
> >> 'vaadin' as a type.
> >>
> >> * Several languages are integrated: template: freemarker/velocity/text,
> >> bsf+code: groovy/janino(java)/jython/rhino/bsh/quercus(php).
> >>
> >> * Macros called by a template language can be implemented in *any*
> >> webslinger resource(any type, any language).  Each integrated template
> >> type has proxies implemented that allow it to call back into webslinger
> >> macros.  velocity-> #Merge("/path/to/file", "/template/to/wrap/it/with"),
> >> freemarker-> <@Merge path="/path/to-file"
> >> template0="/template/to/wrap/it/with"/>.  Macros with content bodies are
> >> fully supported as well.
> >>
> >> * Support for one-type 'wrapper' of a text output, and then different
> >> page styles.  Partial-ajax page updates can then skip this, and do smart
> >> updates of regions of the browser.
> >>
> >> The above list is a non-exhaustive list of features in webslinger-core.
> >> It's really generic, and not tied to any particular implementation.
> >>
> >> The other major thing different about it, is that webslinger is *itself*
> >> a servlet container, just like catalina or glassfish.  However, what sets
> >> it apart from all others, is that it doesn't run standalone; instead, it
> >> is installed into a parent container.  It then fakes/wraps everything, to
> >> support its fancy stuff.  It supports running standard servlets, but they
> >> then get backed by commons-vfs, with overlay support, etc.  This
> >> implementation isn't perfect, and really needs to be improved upon.
> >>
> >> I've been working on a demo for the ofbiz community to play with.
> >> However, the existing embedded site in the repository was rather small, so
> >> I wrote an importer to pull stuff from cwiki, which is what then started
> >> this thread.
> >>
> >> ps: the license on all our code is asl 2.0
> >>
> >>>
> >>> Jacques
> >>>
> >>> From: "Joe Schaefer"<[hidden email]>
> >>>> The url is here
> >>>> https://svn.apache.org/repos/infra/infrastructure/trunk/projects/cms
> >>>> and is publicly readable.
> >>>>
> >>>>
> >>>>
> >>>> ----- Original Message ----
> >>>>> From: Adam Heath<[hidden email]>
> >>>>> To: Jacques Le Roux<[hidden email]>
> >>>>> Cc: [hidden email]; [hidden email]
> >>>>> Sent: Tue, September 21, 2010 3:42:35 PM
> >>>>> Subject: Re: ofbiz wiki(confluence)
> >>>>>
> >>>>> On 09/21/2010 02:07 PM, Jacques Le Roux wrote:
> >>>>>> From: "Adam Heath"<[hidden email]>
> >>>>>>> On 09/21/2010 11:53 AM, Jacques Le Roux wrote:
> >>>>>>>> From: "Adam Heath"<[hidden email]>
> >>>>>>>>> So, I need some admin help with cwiki.apache.org, or at least
> >>>>>>>>> advice.
> >>>>>>>>> I've got a script that uses xmlrpc to confluence, and fetches all
> >>>>>>>>> previous page(+versions), comments, attachments(+versions), tracks
> >>>>>>>>> renames, usernames, and commit messages. I then take all this
> >>>>>>>>> data and convert it into a long series of git commits, with the
> >>>>>>>>> files laid out in a proper webslinger design. The author of each
> >>>>>>>>> git commit is the person who changed the page, added a comment,
> >>>>>>>>> or uploaded a new attachment.
> >>>>>>>
> >>>>>>> This webslinger layout is still in flux, as is my script. The basic
> >>>>>>> logic works, however, by fetching all meta data, storing most of the
> >>>>>>> bulk of that in a temporary cache folder(only for the duration of the
> >>>>>>> script), then sorting each item by date, and replaying the set of
> >>>>>>> changes one by one.
> >>>>>>>
> >>>>>>> It's optimized by storing the 'lastFoo' stuff for each
> >>>>>>> page/comment/attachment/(title->pageId mapping) as needed, so that it
> >>>>>>> can detect newer versions, etc, and not have to do anything. A
> >>>>>>> refresh after a full download against the OFBIZ space takes 2
> >>>>>>> minutes, with nothing new to fetch.
> >>>>>>>
> >>>>>>>>> The issue I am having is the confluence installed on cwiki is old.
> >>>>>>>>> Newer versions support returning the
> >>>>>>>>> PageHistorySummary.versionComment thru the rpc; currently, I have
> >>>>>>>>> to fall back and do a screen scrape of the
> >>>>>>>>> viewpreviousversions.action page.
> >>>>>>>
> >>>>>>> CONFDEV docs definitely list a versionComment field on
> >>>>>>> PageHistorySummary, that is not exposed in 3.2.0 installed on cwiki.
> >>>>>>>
> >>>>>>>>> Where should I ask for help on this, getting this new api
> >>>>>>>>> implemented?
> >>>>>>>>
> >>>>>>>> infra team: [hidden email]
> >>>>>>>>
> >>>>>>>> I put them in copy
> >>>>>>>
> >>>>>>> Thanks. I'm putting more information in this email; I've left
> >>>>>>> dev@ofbiz on the cc for this email, as others might be interested in
> >>>>>>> what I have discovered.
> >>>>>>>
> >>>>>>>>> I also have suggestions to make the api more lightweight, when
> >>>>>>>>> doing incremental updates(which my system supports).
> >>>>>>>
> >>>>>>> Here are the suggestions:
> >>>>>>>
> >>>>>>> I can fetch all attachments for a page. But the attachment data
> >>>>>>> returned doesn't include the current version as a field. I have to
> >>>>>>> split the download url(which is sub-optimal; it includes the current
> >>>>>>> version as a parameter). It might be nice to have an
> >>>>>>> AttachmentSummary type record.
> >>>>>>>
> >>>>>>> What if someone uploads an attachment, then a new version of the
> >>>>>>> attachment, then changes the page, then deletes the attachment? How
> >>>>>>> could I fetch that information? I don't see a way to fetch all
> >>>>>>> attachments for all time against a particular history. This is also
> >>>>>>> a problem for deleted pages, comments, and labels(probably others).
> >>>>>>>
> >>>>>>> Comments in confluence support editing. Is this history stored, and
> >>>>>>> if so, can I get access to it?
> >>>>>>>
> >>>>>>> Are labels versioned?
> >>>>>>>
> >>>>>>> Children of pages are versioned, only because pages themselves are
> >>>>>>> versioned, which includes the value of the parentId at the time the
> >>>>>>> page was changed. However, the frontend doesn't let you see older
> >>>>>>> children, when looking at previous versions.
> >>>>>>>
> >>>>>>> It'd be nice if when calling getPageHistory, I could request a subset
> >>>>>>> of the list, instead of *all* page versions. If a page has 271
> >>>>>>> versions, and I have already fetched them, and the current page has a
> >>>>>>> version of 274, then I only really need to fetch 3 PageHistorySummary
> >>>>>>> records(to get the versionComment from newer versions of confluence).
> >>>>>>>
> >>>>>>> BlogEntrySummary doesn't include version, but BlogEntry does. And I
> >>>>>>> can't fetch old versions of blogs.
> >>>>>>>
> >>>>>>>
> >>>>>>>>>
> >>>>>>>>> As a side note, there is a severe lack of version comments. This
> >>>>>>>>> script ends up producing 3117 commits. Some of those are page
> >>>>>>>>> renames/comments/attachments, which don't have a commit message.
> >>>>>>>>> Most are page commits. There are only 70 change messages. It'd be
> >>>>>>>>> nice if people would comment when they change a page, but I don't
> >>>>>>>>> see a way to enforce that.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>> Adam,
> >>>>>>
> >>>>>> There is currently a beginning effort to create a CMS for apache.org
> >>>>>> (infrastructure/trunk/projects/cms) is yours related to this effort?
> >>>>>
> >>>>> No, it's not. Based on how much time I've spent already(started my
> >>>>> importer last friday), and how familiar I am with ofbiz, it'd
> >>>>> probably take me 2 months to get mostly feature compatible with
> >>>>> confluence(that's for a single person working in his spare time).
> >>>>>
> >>>>>> Jacques
> >>>>>> PS: Not sure how to access infrastructure/trunk/projects/cms/README
> >>>>>> with the rights I have
> >>>>>
> >>>>> You mean it's not public?
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
> >
> >
>
>


     

Re: ofbiz wiki(confluence)

Adam Heath-2
On 09/21/2010 04:50 PM, Joe Schaefer wrote:
> About 10M hits a day.  No java app hosted on
> a single machine would survive for 5 minutes
> with our load.

Are those just page requests(html), or everything(images+css+other files)?

Re: ofbiz wiki(confluence)

Joe Schaefer
I don't recall the breakdown, the 10M figure
counts total daily traffic.  You could look
at Vadim's stats for more details; the bottom
line is that no app that doesn't support static
exports will function as a suitable CMS for
Apache, which is why we're rolling our own.



----- Original Message ----

> From: Adam Heath <[hidden email]>
> To: Joe Schaefer <[hidden email]>
> Cc: Jacques Le Roux <[hidden email]>; [hidden email]; [hidden email]
> Sent: Tue, September 21, 2010 6:11:53 PM
> Subject: Re: ofbiz wiki(confluence)
>
> On 09/21/2010 04:50 PM, Joe Schaefer wrote:
> > About 10M hits a day.  No java app hosted on
> > a single machine would survive for 5 minutes
> > with our load.
>
> Are those just page requests(html), or everything(images+css+other files)?
>


     

Re: ofbiz wiki(confluence)

Adam Heath-2
On 09/21/2010 05:15 PM, Joe Schaefer wrote:
> I don't recall the breakdown, the 10M figure
> counts total daily traffic.  You could look
> at Vadim's stats for more details- the bottom
> line is that no app that doesn't support static
> exports will function as a suitable CMS for
> Apache, which is why we're rolling our own.

Ok, I've looked.  10M seems to be *all* requests.  That number doesn't
scare me.  I've got a site live right now, http://www.hailmerry.com/,
that is reporting 190req/s with ab(from apache httpd) while on
localhost.  That's without any fancy supercache sitting in front.
This site supports online live editing of content.  Anonymous users
have no session, and we make a point of reducing database access for
hot-points.

This is running on a single large shared iscsi disk host.  The cpu
node is an 8-way 2.33GHz cpu, using xen, with the domU given 512M and
a single cpu.  The hosting framework is nowhere near what could be
considered super-fast.

The main feature, that I haven't mentioned before now, that helps with
this, is a thing we have designed called TTLObject.  It is designed to
protect method calls, by saving their results, and returning old
values for a certain amount of time.  It is non-blocking and uses a
state engine internally.  It was overhauled to follow the design patterns in
Java Concurrency in Practice.
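
The TTLObject code itself is not shown in this thread. As a rough illustration of the idea described above (save the result of a call, hand back the old value for a set time, never make callers wait on a lock), here is a minimal Java sketch. The class name TtlCachedValue, its constructor, and the injected clock are all assumptions made for this example, and it uses a simple compare-and-set flag rather than the state engine mentioned above.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical illustration only; this is not the real TTLObject. */
final class TtlCachedValue<T> {
    /** Immutable snapshot: a computed value plus the time it stops being fresh. */
    private static final class Entry<T> {
        final T value;
        final long expiresAt;

        Entry(T value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Supplier<T> loader;   // the expensive call being protected
    private final long ttl;             // freshness window, in clock units
    private final LongSupplier clock;   // injected so the sketch is testable
    private final AtomicReference<Entry<T>> current = new AtomicReference<>();
    private final AtomicBoolean refreshing = new AtomicBoolean(false);

    TtlCachedValue(Supplier<T> loader, long ttl, LongSupplier clock) {
        this.loader = loader;
        this.ttl = ttl;
        this.clock = clock;
    }

    T get() {
        Entry<T> entry = current.get();
        if (entry != null && clock.getAsLong() - entry.expiresAt < 0) {
            return entry.value; // still fresh: no call to the loader
        }
        // Expired or never loaded: exactly one thread wins the right to refresh.
        if (refreshing.compareAndSet(false, true)) {
            try {
                T value = loader.get();
                current.set(new Entry<>(value, clock.getAsLong() + ttl));
                return value;
            } finally {
                refreshing.set(false);
            }
        }
        // Someone else is refreshing: serve the old value rather than wait.
        // With nothing cached yet, fall back to calling the loader directly.
        return entry != null ? entry.value : loader.get();
    }
}
```

With something like `new TtlCachedValue<>(() -> loadMenuFromDb(), 5_000_000_000L, System::nanoTime)` (where `loadMenuFromDb` is a made-up stand-in for any slow call), repeated `get()` calls inside the five-second window never touch the database, and callers that arrive during a refresh are given the stale value.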

Again, I'm not afraid of those numbers.  Our cms stuff(which isn't
quite up to the same level of feature integration as confluence),
stores most data as raw files(to allow svn/git tracking of history).
The database is not used for most functions.

Re: ofbiz wiki(confluence)

Adam Heath-2
On 09/21/2010 05:45 PM, Adam Heath wrote:

> On 09/21/2010 05:15 PM, Joe Schaefer wrote:
>> I don't recall the breakdown, the 10M figure
>> counts total daily traffic. You could look
>> at Vadim's stats for more details- the bottom
>> line is that no app that doesn't support static
>> exports will function as a suitable CMS for
>> Apache, which is why we're rolling our own.
>
> Ok, I've looked. 10M seems to be *all* requests. That number doesn't
> scare me. I've got a site live right now, http://www.hailmerry.com/,
> that is reporting 190req/s with ab(from apache httpd) while on localhost.
> That's without any fancy supercache sitting in front. This site supports
> online live editing of content. Anonymous users have no session, and we
> make a point of reducing database access for hot-points.
>
> This is running on a single large shared iscsi disk host. The cpu node
> is an 8-way 2.33GHz cpu, using xen, with the domU given 512M and a
> single cpu. The hosting framework is nowhere near what could be
> considered super-fast.
>
> The main feature, that I haven't mentioned before now, that helps with
> this, is a thing we have designed called TTLObject. It is designed to
> protect method calls, by saving their results, and returning old values
> for a certain amount of time. It is non-blocking, uses a state engine
> internally. It was overhauled to follow the design patterns in Java
> Concurrency in Practice.
>
> Again, I'm not afraid of those numbers. Our cms stuff(which isn't quite
> up to the same level of feature integration as confluence), stores most
> data as raw files(to allow svn/git tracking of history). The database is
> not used for most functions.

I'm still trying to understand the distribution of load.  If my
understanding is wrong, then please tell me.  It'll give me a target
to strive for.  I've looked at more graphs at (1), and I see a
breakdown of page views per major sub-site(host).

Our current framework underwent a major rewrite 4 years ago, and it
was first deployed into a production state 3.5 years ago.  That
particular site did 3,500,000 total requests the day after it went
live, 500,000 page views.  There was no super-cache in front of it.
The version of the software at the time had a bug, where
If-Modified-Since processing didn't work, so all images were always
fetched by clients.  Our filesystem code has been rewritten, to be
non-blocking(no synchronized keywords), the cow/overlay feature has
had a second-order of speedups, plus other speed fixes.  If I
'flatten' the cow/overlay system, so it's not used, the system easily
approached 1000req/s(single page at a time).

So, I'm still not afraid of these numbers.

Now, if you were to combine *all* these hosts into one, and then try
to run it, we might have an issue.  But, that's not what is currently
happening.

And, with the 10M number you originally gave, that is 115req/s.  If
that is for a single page, then our current software handles that
fine.  If that is for 115 different pages at the same time, then I
will have to get back to you, to try that test.  I don't have a
program that can request 115 different pages at once.
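
For reference, the 115req/s figure is 10M spread evenly over one day (10,000,000 / 86,400 s, roughly 115.7). A test that requests many different pages at once could be sketched roughly as below. This assumes Java 11 or later and its built-in java.net.http client; the class name BurstFetch and both method names are made up for this example, and the list of page URIs has to be supplied by whoever runs it.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

/** Hypothetical helper for firing a burst of different page requests. */
final class BurstFetch {
    /** Average rate if a daily total were spread evenly over 24 hours. */
    static double averageRequestsPerSecond(long requestsPerDay) {
        return requestsPerDay / 86_400.0;
    }

    /**
     * Starts one asynchronous GET per page, then waits for all of them.
     * Returns the HTTP status per page, or -1 where the request failed.
     */
    static List<Integer> fetchAll(List<URI> pages) {
        HttpClient client = HttpClient.newHttpClient();
        // Collecting first makes sure every request is in flight
        // before we start waiting on any of them.
        List<CompletableFuture<Integer>> pending = pages.stream()
            .map(uri -> HttpRequest.newBuilder(uri).GET().build())
            .map(request -> client
                .sendAsync(request, HttpResponse.BodyHandlers.discarding())
                .thenApply(response -> response.statusCode())
                .exceptionally(error -> -1))
            .collect(Collectors.toList());
        return pending.stream()
            .map(CompletableFuture::join)
            .collect(Collectors.toList());
    }
}
```

Passing 115 distinct URIs to `fetchAll` gives one burst of 115 concurrent requests. It says nothing about sustained load, so timing repeated bursts would still be needed to compare against the ab numbers.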

Also, if the problem is with remote clients tying up a
thread/connection slot, then that is a separate problem from the
backend system.  The backend system should have a small thread pool,
so that it can run fast, and then the frontend(either catalina itself,
or apache/mod-jk) does the send using non-blocking-io.

1: http://people.apache.org/~vgritsenko/stats/index.html

Re: ofbiz wiki(confluence)

Mark Thomas
On 22/09/2010 09:09, Adam Heath wrote:
> Now, if you were to combine *all* these hosts into one, and then try to
> run it, we might have an issue.  But, that's not what is currently
> happening.

Not sure what you mean here. www.a.o and every tlp.a.o site are served
from a single httpd instance (along with a handful of other virtual
hosts). Everything we require the cms to handle is currently handled by
a single httpd instance.

There are actually two machines. One in the EU and one in the US.
Normally we use geo-based load balancing but a single machine has to be
able to handle all of the traffic comfortably so we can do maintenance
on the other.

> And, with the 10M number you originally gave, that is 115req/s.  If that
> is for a single page, then our current software handles that fine.  If
> that is for 115 different pages at the same time, then I will have to
> get back to you, to try that test.  I don't have a program that can
> request 115 different pages at once.

That is across all virtual hosts so they will be different pages.
http://www.apache.org/server-status will give you a snapshot of current
load.

> Also, if the problem is with remote clients tying up a thread/connection
> slot, then that is a separate problem from the backend system.  The
> backend system should have a small thread pool, so that it can run fast,
> and then the frontend(either catalina itself, or apache/mod-jk) does the
> send using non-blocking-io.

The problem is that the systems we have tried before can't handle the load.

It sounds like you are using Tomcat under the covers. Whilst you will
maximise throughput when the threadpool size is roughly the same as the
number of cores on the machine, the overhead of having a few hundred
processing threads isn't that great. We would always front Tomcat with
httpd so an appropriate mod_proxy/mod_jk/Tomcat connector config can
ensure that we don't have to have one Tomcat thread per current
connection (since with keep-alive connections >> requests).
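
As a plain-Java illustration of the pool-sizing point above (this is not Tomcat, mod_proxy or mod_jk configuration), a worker pool sized to the number of cores can be sketched as follows. The class name CoreSizedPool and its methods are made up for this example.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical example: process many requests on a pool sized to the cores. */
final class CoreSizedPool {
    /** One worker thread per available core, and never fewer than one. */
    static int poolSize() {
        return Math.max(1, Runtime.getRuntime().availableProcessors());
    }

    /** Runs every task on the fixed-size pool and sums their results. */
    static int runAll(List<Callable<Integer>> tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize());
        try {
            int sum = 0;
            // invokeAll queues all tasks; only poolSize() of them run at once.
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                sum += result.get();
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The number of queued tasks can far exceed the number of threads, which is the same separation described above: many open connections at the front, a small set of threads doing the actual processing.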

Mark