So, I need some admin help with cwiki.apache.org, or at least advice.
I've got a script that talks to Confluence over XML-RPC and fetches all pages (plus previous versions), comments, and attachments (plus versions), and tracks renames, usernames, and commit messages. I then take all this data and convert it into a long series of git commits, with the files laid out in a proper webslinger design. The author of each git commit is the person who changed the page, added a comment, or uploaded a new attachment.

The issue I am having is that the Confluence installed on cwiki is old. Newer versions support returning PageHistorySummary.versionComment through the RPC; currently, I have to fall back to screen-scraping the viewpreviousversions.action page.

Where should I ask for help on getting this new API implemented?

I also have suggestions to make the API more lightweight when doing incremental updates (which my system supports).

As a side note, there is a severe lack of version comments. This script ends up producing 3117 commits. Some of those are page renames/comments/attachments, which don't have a commit message. Most are page commits. There are only 70 change messages. It'd be nice if people would comment when they change a page, but I don't see a way to enforce that.
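The conversion described above can be pictured roughly as follows. This is a minimal sketch and not the actual script: the `Change` record, its field names, the oldest-first ordering, and the placeholder message format are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Change:
    """One wiki event that becomes one git commit (hypothetical record)."""
    when: datetime
    author: str                     # person who made the change
    kind: str                       # "page", "comment", or "attachment"
    path: str                       # file touched in the working tree
    message: Optional[str] = None   # version comment, often missing


def commit_message(change: Change) -> str:
    """Use the version comment if present, else a generated placeholder."""
    if change.message:
        return change.message
    return f"{change.kind}: {change.path}"


def plan_commits(changes: List[Change]) -> List[dict]:
    """Order changes oldest-first and describe the commit for each one."""
    ordered = sorted(changes, key=lambda c: c.when)
    return [
        {"author": c.author, "date": c.when.isoformat(), "message": commit_message(c)}
        for c in ordered
    ]
```

With only 70 version comments across 3117 commits, nearly every commit would fall through to the generated placeholder, which is why the missing comments matter for the resulting history.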
From: "Adam Heath" <[hidden email]>
> Where should I ask for help on this, getting this new api implemented?

infra team: [hidden email]

I put them in copy

Jacques
On 09/21/2010 11:53 AM, Jacques Le Roux wrote:
> From: "Adam Heath" <[hidden email]>
>> I've got a script that uses xmlrpc to confluence, and fetches all
>> previous page(+versions), comments, attachments(+versions), tracks
>> renames, usernames, and commit messages.

This webslinger layout is still in flux, as is my script. The basic logic works, however: it fetches all metadata, stores the bulk of it in a temporary cache folder (only for the duration of the script), sorts each item by date, and replays the set of changes one by one.

It's optimized by storing the 'lastFoo' values for each page/comment/attachment/(title->pageId mapping) as needed, so that it can detect newer versions, etc., and skip the work when nothing has changed. A refresh after a full download against the OFBIZ space takes 2 minutes, with nothing new to fetch.

>> The issue I am having is the confluence installed on cwiki is old.

The CONFDEV docs definitely list a versionComment field on PageHistorySummary, which is not exposed in the 3.2.0 installed on cwiki.

>> Where should I ask for help on this, getting this new api implemented?
>
> infra team: [hidden email]
>
> I put them in copy

Thanks. I'm putting more information in this email; I've left dev@ofbiz on the cc, as others might be interested in what I have discovered.

>> I also have suggestions to make the api more lightweight, when doing
>> incremental updates(which my system supports).

Here are the suggestions:

I can fetch all attachments for a page. But the attachment data returned doesn't include the current version as a field. I have to split the download URL, which includes the current version as a parameter (sub-optimal). It might be nice to have an AttachmentSummary type record.

What if someone uploads an attachment, then a new version of the attachment, then changes the page, then deletes the attachment? How could I fetch that information? I don't see a way to fetch all attachments for all time against a particular history. This is also a problem for deleted pages, comments, and labels (probably others).

Comments in Confluence support editing. Is this history stored, and if so, can I get access to it?

Are labels versioned?

Children of pages are versioned only because pages themselves are versioned, which includes the value of the parentId at the time the page was changed. However, the frontend doesn't let you see older children when looking at previous versions.

It'd be nice if, when calling getPageHistory, I could request a subset of the list instead of *all* page versions. If a page has 271 versions, and I have already fetched them, and the current page has a version of 274, then I only really need to fetch 3 PageHistorySummary records (to get the versionComment from newer versions of Confluence).

BlogEntrySummary doesn't include version, but BlogEntry does. And I can't fetch old versions of blogs.
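Two of the points above can be illustrated with a short sketch: working out which page versions are still missing locally, and recovering an attachment's version from its download URL. The helper names, the example URLs, and the assumption that the query parameter is called `version` are illustrative only; nothing here is taken from the actual script or from the Confluence API.

```python
from typing import List, Optional
from urllib.parse import parse_qs, urlsplit


def attachment_version(download_url: str) -> Optional[int]:
    """Pull the version out of a download URL's query string.

    Assumes the parameter is named "version"; returns None if it is absent.
    """
    query = parse_qs(urlsplit(download_url).query)
    values = query.get("version")
    return int(values[0]) if values else None


def versions_to_fetch(last_seen: int, current: int) -> List[int]:
    """Versions newer than the last one already stored locally."""
    return list(range(last_seen + 1, current + 1))
```

For the example in the message, `versions_to_fetch(271, 274)` yields the three versions 272, 273, and 274. A ranged getPageHistory call would let the client request only those records instead of the full list.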
Adam,

There is currently an early effort to create a CMS for apache.org (infrastructure/trunk/projects/cms). Is yours related to this effort?

Jacques

PS: Not sure how to access infrastructure/trunk/projects/cms/README with the rights I have.
On 09/21/2010 02:07 PM, Jacques Le Roux wrote:
> There is currently a beginning effort to create a CMS for apache.org
> (infrastructure/trunk/projects/cms) is yours related to this effort?

No, it's not. Based on how much time I've spent already (I started my importer last Friday), and how familiar I am with ofbiz, it'd probably take me 2 months to get mostly feature-compatible with Confluence (that's for a single person working in his spare time).

> PS: Not sure how to access to infrastructure/trunk/projects/cms/README
> with the rights I have

You mean it's not public?
The url is here
https://svn.apache.org/repos/infra/infrastructure/trunk/projects/cms
and is publicly readable.

----- Original Message ----
> From: Adam Heath <[hidden email]>
> To: Jacques Le Roux <[hidden email]>
> Cc: [hidden email]; [hidden email]
> Sent: Tue, September 21, 2010 3:42:35 PM
> Subject: Re: ofbiz wiki(confluence)
>
> > PS: Not sure how to access to infrastructure/trunk/projects/cms/README
> > with the rights I have
>
> You mean it's not public?
Thanks Joe,
I quickly tried through Subclipse and got an error.
I guess now Adam has a better idea of what I was talking about.
I mean maybe Webslinger could be used, just my 2 cts...

Jacques

From: "Joe Schaefer" <[hidden email]>
> The url is here
> https://svn.apache.org/repos/infra/infrastructure/trunk/projects/cms
> and is publicly readable.
On 09/21/2010 03:53 PM, Jacques Le Roux wrote:
> Thanks Joe, > > I quickly tried through Subclipse and got an error. > I guess now Adam has a better idea of what I was talking about. > I mean maybe Webslinger could be used, just my 2 cts... I will attempt to describe webslinger for those who haven't ever heard of it before. The major features(bullet points) of webslinger-core are: * Content data stored as raw files. This is to allow normal programs, like grep, find, vim, dreamweaver, photoshop, git, svn work without modifications. * Makes use of commons-vfs, and a custom set of layered filesystems. * One layered filesystem is called 'flat'. Arbitrary attributes (FileObject.getContent().getAttribute(name)) are stored as path/to/file@, into separate files. Again, this allows for easy integration with other systems. * Another layered filesystem is called 'wsvfs'. This is an overlay/cow type filesystem, where multiple real filesystems are combined on the fly, giving merged directory listings, with support for up-copy and whiteout. Any point of the tree can 'overlay' any other part of the tree, altho this feature isn't normally nescessary. * Automatic extension resolution. This allows for pretty urls that don't have extensions, and allow the implementation on the server to be changed as nescessary. End-users have problems with extensions, so that is hidden. * Any 'path' can be configured to do it's own sub-path management. This allows for nice urls like /shop/product/$productId/detail and /shop/cart/add/$productId and /Login/Path/To/Protected/Page. These urls then show up nicely in hit reports. They are also easier for end-users to remember. * Automatic attribute inheritance. Extensions are used to find the mime-type of a file. Or the mime-type can be set directly on the file. Then, any attribute files set in /WEB-INF/DefaultMimeAttributes/$mime/$type are inherited for the resource in question. 
This allows mapping all ${page}.cf to application/x-server-side-confluence, creating an attribute called 'type' with a value of 'confluence-page'. More on the types in a bit. * Every resource has a type, and a handler. Standard types are jsp, cgi, binary. Base types are event(bsf-based), code, template. Type can also be servlet, or, even more advanced(but not ready to be released) is 'vaadin' as a type. * Several languages are integrated: template: freemarker/velocity/text, bsf+code: groovy/janino(java)/jython/rhino/bsh/quercus(php). * Macros called by a template language can be implemented in *any* webslinger resource(any type, any language). Each integrated template type has proxies implemented that allow it to call back into webslinger macros. velocity-> #Merge("/path/to/file", "/template/to/wrap/it/with"), freemarker-> <@Merge path="/path/to-file" template0="/template/to/wrap/it/with"/>. Support for macros with content bodies is fully supported as well. * Support for one-type 'wrapper' of a text output, and then different page styles. Partial-ajax page updates can then skip this, and do smart updates of regions of the browser. The above list is an non-inclusive list of features in webslinger-core. It's really generic, and not tied to any particular implementation. The other major thing different about it, is that webslinger is *itself* a servlet container, just like catalina or glashfish. However, what sets it apart from all others, is that it doesn't run standalone; instead, it is installed into a parent container. It then fakes/wraps everything, to support it's fancy stuff. It supports running standard servlets, but then get backed by commons-vfs, with overlay support, etc. This implementation isn't perfect, and really needs to be improved upon. I've been working on a demo for the ofbiz community to play with. 
However, the existing embedded site in the repository was rather small, so I wrote an importer to pull stuff from cwiki, which is what then started this thread.

ps: the license on all our code is ASL 2.0

>
> Jacques
>
> From: "Joe Schaefer" <[hidden email]>
>> The url is here
>> https://svn.apache.org/repos/infra/infrastructure/trunk/projects/cms
>> and is publicly readable.
>>
>> ----- Original Message ----
>>> From: Adam Heath <[hidden email]>
>>> To: Jacques Le Roux <[hidden email]>
>>> Cc: [hidden email]; [hidden email]
>>> Sent: Tue, September 21, 2010 3:42:35 PM
>>> Subject: Re: ofbiz wiki(confluence)
>>>
>>> On 09/21/2010 02:07 PM, Jacques Le Roux wrote:
>>> > From: "Adam Heath" <[hidden email]>
>>> >> On 09/21/2010 11:53 AM, Jacques Le Roux wrote:
>>> >>> From: "Adam Heath" <[hidden email]>
>>> >>>> So, I need some admin help with cwiki.apache.org, or at least advice.
>>> >>>> I've got a script that uses xmlrpc to confluence, and fetches all
>>> >>>> previous page(+versions), comments, attachments(+versions), tracks
>>> >>>> renames, usernames, and commit messages. I then take all this data and
>>> >>>> convert it into a long series of git commits, with the files laid out
>>> >>>> in a proper webslinger design. The author of each git commit is the
>>> >>>> person who changed the page, added a comment, or uploaded a new
>>> >>>> attachment.
>>> >>
>>> >> This webslinger layout is still in flux, as is my script. The basic
>>> >> logic works, however, by fetching all meta data, storing most of the
>>> >> bulk of that in a temporary cache folder (only for the duration of the
>>> >> script), then sorting each item by date, and replaying the set of
>>> >> changes one by one.
>>> >>
>>> >> It's optimized by storing the 'lastFoo' stuff for each
>>> >> page/comment/attachment/(title->pageId mapping) as needed, so that it
>>> >> can detect newer versions, etc, and not have to do anything. A refresh
>>> >> after a full download against the OFBIZ space takes 2 minutes, with
>>> >> nothing new to fetch.
>>> >>
>>> >>>> The issue I am having is the confluence installed on cwiki is old.
>>> >>>> Newer versions support returning the PageHistorySummary.versionComment
>>> >>>> through the rpc; currently, I have to fall back and do a screen scrape
>>> >>>> of the viewpreviousversions.action page.
>>> >>
>>> >> CONFDEV docs definitely list a versionComment field on
>>> >> PageHistorySummary, that is not exposed in 3.2.0 installed on cwiki.
>>> >>
>>> >>>> Where should I ask for help on this, getting this new api implemented?
>>> >>>
>>> >>> infra team: [hidden email]
>>> >>>
>>> >>> I put them in copy
>>> >>
>>> >> Thanks. I'm putting more information in this email; I've left
>>> >> dev@ofbiz on the cc for this email, as others might be interested in
>>> >> what I have discovered.
>>> >>
>>> >>>> I also have suggestions to make the api more lightweight, when doing
>>> >>>> incremental updates (which my system supports).
>>> >>
>>> >> Here are the suggestions:
>>> >>
>>> >> I can fetch all attachments for a page. But the attachment data
>>> >> returned doesn't include the current version as a field. I have to
>>> >> split the download url (which is sub-optimal; it includes the current
>>> >> version as a parameter). It might be nice to have an AttachmentSummary
>>> >> type record.
>>> >>
>>> >> What if someone uploads an attachment, then a new version of the
>>> >> attachment, then changes the page, then deletes the attachment? How
>>> >> could I fetch that information? I don't see a way to fetch all
>>> >> attachments for all time against a particular history. This is also a
>>> >> problem for deleted pages, comments, and labels (probably others).
>>> >>
>>> >> Comments in confluence support editing. Is this history stored, and
>>> >> if so, can I get access to it?
>>> >>
>>> >> Are labels versioned?
>>> >>
>>> >> Children of pages are versioned, only because pages themselves are
>>> >> versioned, which includes the value of the parentId at the time the
>>> >> page was changed. However, the frontend doesn't let you see older
>>> >> children, when looking at previous versions.
>>> >>
>>> >> It'd be nice if when calling getPageHistory, I could request a subset
>>> >> of the list, instead of *all* page versions. If a page has 271
>>> >> versions, and I have already fetched them, and the current page has a
>>> >> version of 274, then I only really need to fetch 3 PageHistorySummary
>>> >> records (to get the versionComment from newer versions of confluence).
>>> >>
>>> >> BlogEntrySummary doesn't include version, but BlogEntry does. And I
>>> >> can't fetch old versions of blogs.
>>> >>
>>> >>>> As a side note, there is a severe lack of version comments. This
>>> >>>> script ends up producing 3117 commits. Some of those are page
>>> >>>> renames/comments/attachments, which don't have a commit message. Most
>>> >>>> are page commits. There are only 70 change messages. It'd be nice if
>>> >>>> people would comment when they change a page, but I don't see a way to
>>> >>>> enforce that.
>>> >
>>> > Adam,
>>> >
>>> > There is currently a beginning effort to create a CMS for apache.org
>>> > (infrastructure/trunk/projects/cms). Is yours related to this effort?
>>>
>>> No, it's not. Based on how much time I've spent already (started my
>>> importer last friday), and how familiar I am with ofbiz, it'd
>>> probably take me 2 months to get mostly feature compatible with
>>> confluence (that's for a single person working in his spare time).
>>>
>>> > Jacques
>>> > PS: Not sure how to access infrastructure/trunk/projects/cms/README
>>> > with the rights I have
>>>
>>> You mean it's not public?
|
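The importer logic described in the quoted message (collect metadata, sort each item by date, replay the changes one by one, and use stored 'lastFoo' values to skip what was already imported) can be sketched roughly as follows. This is only an illustration under assumptions: the `Change` record, its field names, and the `last_seen` mapping are hypothetical and are not taken from Adam's script or from the Confluence API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Change:
    """One historical wiki event (hypothetical record shape, not the Confluence one)."""
    when: datetime      # modification time reported by the wiki
    author: str         # wiki username, used as the commit author
    kind: str           # "page", "comment" or "attachment"
    key: str            # stable identifier, e.g. a page or attachment id
    version: int        # version number of this item
    message: str = ""   # version comment; frequently empty


def new_changes(changes: Iterable[Change], last_seen: Dict[str, int]) -> List[Change]:
    """Drop versions that were already imported, then order the rest oldest-first."""
    fresh = [c for c in changes if c.version > last_seen.get(c.key, 0)]
    return sorted(fresh, key=lambda c: (c.when, c.key, c.version))


def replay(changes: Iterable[Change], last_seen: Dict[str, int]) -> List[str]:
    """Replay changes one by one and return the commit subjects that would be used."""
    subjects = []
    for c in new_changes(changes, last_seen):
        # Fall back to a generated subject when no version comment exists.
        subject = c.message or f"{c.kind} {c.key} version {c.version}"
        subjects.append(f"{c.author}: {subject}")
        # Remember progress so that the next refresh has nothing to do.
        last_seen[c.key] = max(c.version, last_seen.get(c.key, 0))
    return subjects
```

With `last_seen = {"p": 271}` and a history running up to version 274, `new_changes` keeps only the 3 newer records. Note that this filter runs on the client after the full history has been fetched, which is the overhead a subset parameter on getPageHistory would avoid.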
Sounds interesting, but for us we require static exports.
Since you're using flat files that might not be all that hard for you to implement.

Confluence as a CMS has an interesting future ahead of it at the ASF. Right now we have a hard dependency on the auto-export plugin, whose support characteristics prevent us from running the latest versions of confluence. If the situation doesn't change over the next few months, we'll likely just phase out the CMS aspects of confluence and replace it with something that natively supports static exports.

----- Original Message ----
> From: Adam Heath <[hidden email]>
> To: Jacques Le Roux <[hidden email]>
> Cc: Joe Schaefer <[hidden email]>; [hidden email]; [hidden email]
> Sent: Tue, September 21, 2010 5:34:10 PM
> Subject: Re: ofbiz wiki(confluence)
|
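Joe's remark that a static export should not be hard when the content already lives in flat files can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function name `export_static`, the `.txt` source suffix, and the trivial `<pre>` rendering are hypothetical stand-ins, and a real export would run each resource through webslinger's own templates and handlers instead.

```python
import html
from pathlib import Path
from typing import List


def export_static(content_root: Path, out_root: Path, suffix: str = ".txt") -> List[str]:
    """Render every source file under content_root to an .html file under out_root."""
    written = []
    for src in sorted(content_root.rglob(f"*{suffix}")):
        # Mirror the source tree, swapping the source suffix for .html.
        rel = src.relative_to(content_root).with_suffix(".html")
        dest = out_root / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        # Placeholder rendering: escape the raw text and wrap it in a page.
        body = html.escape(src.read_text(encoding="utf-8"))
        title = html.escape(src.stem)
        dest.write_text(
            f"<html><head><title>{title}</title></head>"
            f"<body><pre>{body}</pre></body></html>\n",
            encoding="utf-8",
        )
        written.append(rel.as_posix())
    return written
```

The output directory can then be served by any plain web server, with no application code on the request path, which is the property a static export is meant to provide.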
On 09/21/2010 04:41 PM, Joe Schaefer wrote:
> Sounds interesting, but for us we require static exports.
> Since you're using flat files that might not be all that
> hard for you to implement.

Why? What kind of load do you have? We (brainfood) have survived slashdotting, without resorting to fancy frontends like varnish. Webslinger has been written to be nonblocking (no synchronized keywords), use weak/soft references, and not create sessions until absolutely necessary.

> Confluence as a CMS has an interesting future ahead of it
> at the ASF. Right now we have a hard dependency on the
> auto-export plugin, whose support characteristics prevent
> us from running the latest versions of confluence. If
> the situation doesn't change over the next few months,
> we'll likely just phase out the CMS aspects of confluence
> and replace it with something that natively supports
> static exports.

We would like to support static exports too, and it might even be possible, with little effort. However, it's just not been necessary for us, as we've never had a problem with any kind of load whatsoever.
|
About 10M hits a day. No java app hosted on
a single machine would survive for 5 minutes with our load. ----- Original Message ---- > From: Adam Heath <[hidden email]> > To: Joe Schaefer <[hidden email]> > Cc: Jacques Le Roux <[hidden email]>; [hidden email]; >[hidden email] > Sent: Tue, September 21, 2010 5:48:26 PM > Subject: Re: ofbiz wiki(confluence) > > On 09/21/2010 04:41 PM, Joe Schaefer wrote: > > Sounds interesting, but for us we require static exports. > > Since you're using flat files that might not be all that > > hard for you to implement. > > Why? What kind of load do you have? We(brainfood) have survied > slashdotting, without resorting to fancy frontends like varnish. It's > been written to be nonblocking(no synchronized keywords), use > weak/soft references, and not create sessions until absolutely nescessary. > > > Confluence as a CMS has an interesting future ahead of it > > at the ASF. Right now we have a hard dependency on the > > auto-export plugin, whose support characteristics prevent > > us from running the latest versions of confluence. If > > the situation doesn't change over the next few months, > > we'll likely just phase out the CMS aspects of confluence > > and replace it with something that natively supports > > static exports. > > We would like to support static exports too, and it might even be > possible, with little effort. However, it's just not been nescessary > for us, as we've never had a problem with any kind of load whatsoever. > > > ----- Original Message ---- > >> From: Adam Heath<[hidden email]> > >> To: Jacques Le Roux<[hidden email]> > >> Cc: Joe Schaefer<[hidden email]>; [hidden email]; > >> [hidden email] > >> Sent: Tue, September 21, 2010 5:34:10 PM > >> Subject: Re: ofbiz wiki(confluence) > >> > >> On 09/21/2010 03:53 PM, Jacques Le Roux wrote: > >>> Thanks Joe, > >>> > >>> I quickly tried through Subclipse and got an error. > >>> I guess now Adam has a better idea of what I was talking about. > >>> I mean maybe Webslinger could be used, just my 2 cts... 
> >> > >> I will attempt to describe webslinger for those who haven't ever heard of >it > >> before. > >> > >> The major features(bullet points) of webslinger-core are: > >> > >> * Content data stored as raw files. This is to allow normal programs, >like > >> grep, find, vim, dreamweaver, photoshop, git, svn work without >modifications. > >> > >> * Makes use of commons-vfs, and a custom set of layered filesystems. > >> > >> * One layered filesystem is called 'flat'. Arbitrary attributes > >> (FileObject.getContent().getAttribute(name)) are stored as path/to/file@, >into > >> separate files. Again, this allows for easy integration with other >systems. > >> > >> * Another layered filesystem is called 'wsvfs'. This is an overlay/cow >type > >> filesystem, where multiple real filesystems are combined on the fly, >giving > >> merged directory listings, with support for up-copy and whiteout. Any >point of > >> the tree can 'overlay' any other part of the tree, altho this feature >isn't > >> normally nescessary. > >> > >> * Automatic extension resolution. This allows for pretty urls that don't >have > >> extensions, and allow the implementation on the server to be changed as > >> nescessary. End-users have problems with extensions, so that is hidden. > >> > >> * Any 'path' can be configured to do it's own sub-path management. This >allows > >> for nice urls like /shop/product/$productId/detail and > >> /shop/cart/add/$productId and /Login/Path/To/Protected/Page. These urls >then > >> show up nicely in hit reports. They are also easier for end-users to >remember. > >> > >> * Automatic attribute inheritance. Extensions are used to find the >mime-type > >> of a file. Or the mime-type can be set directly on the file. Then, any > >> attribute files set in /WEB-INF/DefaultMimeAttributes/$mime/$type are >inherited > >> for the resource in question. 
This allows mapping all ${page}.cf to > >> application/x-server-side-confluence, creating an attribute called 'type' >with > >> a value of 'confluence-page'. More on the types in a bit. > >> > >> * Every resource has a type, and a handler. Standard types are jsp, > >> binary. Base types are event(bsf-based), code, template. Type can also >be > >> servlet, or, even more advanced(but not ready to be released) is 'vaadin' >as a > >> type. > >> > >> * Several languages are integrated: template: freemarker/velocity/text, > >> bsf+code: groovy/janino(java)/jython/rhino/bsh/quercus(php). > >> > >> * Macros called by a template language can be implemented in *any* >webslinger > >> resource(any type, any language). Each integrated template type has >proxies > >> implemented that allow it to call back into webslinger macros. > >> #Merge("/path/to/file", "/template/to/wrap/it/with"), freemarker-> ><@Merge > >> path="/path/to-file" template0="/template/to/wrap/it/with"/>. Support for > >> macros with content bodies is fully supported as well. > >> > >> * Support for one-type 'wrapper' of a text output, and then different >page > >> styles. Partial-ajax page updates can then skip this, and do smart >updates of > >> regions of the browser. > >> > >> The above list is an non-inclusive list of features in webslinger-core. >It's > >> really generic, and not tied to any particular implementation. > >> > >> The other major thing different about it, is that webslinger is *itself* >a > >> servlet container, just like catalina or glashfish. However, what sets it >apart > >>from all others, is that it doesn't run standalone; instead, it is >installed > >> into a parent container. It then fakes/wraps everything, to support it's >fancy > >> stuff. It supports running standard servlets, but then get backed by > >> commons-vfs, with overlay support, etc. This implementation isn't >perfect, and > >> really needs to be improved upon. 
> >> > >> I've been working on a demo for the ofbiz community to play with. However, >the > >> existing embedded site in the repository was rather small, so I wrote an > >> importer to pull stuff from cwiki, which is what then started this >thread. > >> > >> ps: the license on all our code is asl 2.0 > >> > >>> > >>> Jacques > >>> > >>> From: "Joe Schaefer"<[hidden email]> > >>>> The url is here > >>>> https://svn.apache.org/repos/infra/infrastructure/trunk/projects/cms > >>>> and is publicly readable. > >>>> > >>>> > >>>> > >>>> ----- Original Message ---- > >>>>> From: Adam Heath<[hidden email]> > >>>>> To: Jacques Le Roux<[hidden email]> > >>>>> Cc: [hidden email]; [hidden email] > >>>>> Sent: Tue, September 21, 2010 3:42:35 PM > >>>>> Subject: Re: ofbiz wiki(confluence) > >>>>> > >>>>> On 09/21/2010 02:07 PM, Jacques Le Roux wrote: > >>>>>> From: "Adam Heath"<[hidden email]> > >>>>> >> On 09/21/2010 11:53 AM, Jacques Le Roux wrote: > >>>>> >>> From: "Adam Heath"<[hidden email]> > >>>>> >>>> So, I need some admin help with cwiki.apache.org, or at least > >>>>> advice. > >>>>>>>>> I've got a script that uses xmlrpc to confluence, and fetches all > >>>>>>>>> previous page(+versions), comments, attachments(+versions), tracks > >>>>> >>>> renames, usernames, and commit messages. I then take all this > >>>>> data and > >>>>>>>>> convert it into a long series of git commits, with the files > >>>>> layed out > >>>>>>>>> in a proper webslinger design. The author of each git commit is > >>>>>>>>> person who changed the page, added a comment, or uploaded a new > >>>>>>>>> attachment. > >>>>>>> > >>>>>>> This webslinger layout is still in flux, as is my script. The basic > >>>>>>> logic works, however, by fetching all meta data, storing most of >the > >>>>>>> bulk of that in a temporary cache folder(only for the duration of >the > >>>>>>> script), then sorting each item by date, and replaying the set of > >>>>>>> changes one by one. 
> >>>>>>> > >>>>>>> It's optimized by storing the 'lastFoo' stuff for each > >>>>>>> page/comment/attachment/(title->pageId mapping) as needed, so that >it > >>>>>>> can detect newer versions, etc, and not have to do anything. A > >>>>> refresh > >>>>>>> after a full download against the OFBIZ space takes 2 minutes, with > >>>>>>> nothing new to fetch. > >>>>>>> > >>>>>>>>> The issue I am having is the confluence installed on cwiki is >old. > >>>>>>>>> Newer versions support returning the > >>>>> PageHistorySummary.versionComment > >>>>> >>>> thru the rpc; currently, I have to fall back and do a screen > >>>>> scrape of > >>>>>>>>> the viewpreviousversions.action page. > >>>>>>> > >>>>> >> CONFDEV docs definately list a versionComment field on > >>>>> >> PageHistorySummary, that is not exposed in 3.2.0 installed on >cwiki. > >>>>>>> > >>>>>>>>> Where should I ask for help on this, getting this new api > >>>>> implemented? > >>>>>>>> > >>>>>>>> infra team: [hidden email] > >>>>> >>> > >>>>>>>> I put them in copy > >>>>> >> > >>>>>>> Thanks. I'm putting more information in this email; I've left > >>>>>>> dev@ofbiz on the cc for this email, as others might be interested in > >>>>>>> what I have discovered. > >>>>>>> > >>>>>>>>> I also have suggestions to make the api more lightweight, when > >>>>> doing > >>>>>>>>> incremental updates(which my system supports). > >>>>>>> > >>>>>>> Here are the suggestions: > >>>>>>> > >>>>>>> I can fetch all attachments for a page. But the attachment data > >>>>>>> returned doesn't include the current version as a field. I have to > >>>>>>> split the download url(which is sub-optimal; it includes the > >>>>>>> version as a parameter). It might be nice to have an > >>>>> AttachmentSummary > >>>>>>> type record. > >>>>>>> > >>>>>>> What if uploads an attachment, then a new version of the attachment, > >>>>>>> then changes the page, then deletes the attachment? How could I >fetch > >>>>>>> that information? 
> >>>>>>> I don't see a way to fetch all attachments for all
> >>>>>>> time against a particular history. This is also a problem for deleted
> >>>>>>> pages, comments, and labels(probably others).
> >>>>>>>
> >>>>>>> Comments in confluence support editing. Is this history stored, and
> >>>>>>> if so, can I get access to it?
> >>>>>>>
> >>>>>>> Are labels versioned?
> >>>>>>>
> >>>>>>> Children of pages are versioned, only because pages themselves are
> >>>>>>> versioned, which includes the value of the parentId at the time the
> >>>>>>> page was changed. However, the frontend doesn't let you see older
> >>>>>>> children, when looking at previous versions.
> >>>>>>>
> >>>>>>> It'd be nice if when calling getPageHistory, I could request a
> >>>>>>> of the list, instead of *all* page versions. If a page has 271
> >>>>>>> versions, and I have already fetched them, and the current page has a
> >>>>>>> version of 274, then I only really need to fetch 3 PageHistorySummary
> >>>>>>> records(to get the versionComment from newer versions of confluence).
> >>>>>>>
> >>>>>>> BlogEntrySummary doesn't include version, but BlogEntry does. And I
> >>>>>>> can't fetch old versions of blogs.
> >>>>>>>
> >>>>>>>>> As a side note, there is a severe lack of version comments. This
> >>>>>>>>> script ends up producing 3117 commits. Some of those are page
> >>>>>>>>> renames/comments/attachments, which don't have a commit message. Most
> >>>>>>>>> are page commits. There are only 70 change messages. It'd be nice if
> >>>>>>>>> people would comment when they change a page, but I don't see a way to
> >>>>>>>>> enforce that.
> >>>>>>
> >>>>>> Adam,
> >>>>>>
> >>>>>> There is currently a beginning effort to create a CMS for apache.org
> >>>>>> (infrastructure/trunk/projects/cms) is yours related to this effort?
> >>>>>
> >>>>> No, it's not.
> >>>>> Based on how much time I've spent already(started my
> >>>>> importer last friday), and how familiar I am with ofbiz, it'd
> >>>>> probably take me 2 months to get mostly feature compatible with
> >>>>> confluence(that's for a single person working in his spare time).
> >>>>>
> >>>>>> Jacques
> >>>>>> PS: Not sure how to access to
> >>>>>> with the rights I have
> >>>>>
> >>>>> You mean it's not public?
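The fetch, sort, and replay logic described in the quoted message above, together with the 'lastFoo' bookkeeping for incremental runs, might look roughly like this. It is only a sketch of the idea, not the actual importer: the `Change` record, its field names, and the `commit` callback are hypothetical, and the real script talks to Confluence over XML-RPC and writes git commits, neither of which is shown here.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Change:
    """One wiki event: a page edit, a comment, or an attachment upload (hypothetical record)."""
    kind: str           # "page", "comment", or "attachment"
    item_id: str        # illustrative identifier for the page/comment/attachment
    version: int
    author: str
    when: datetime
    message: str = ""   # version comment; usually empty on cwiki


def new_changes(fetched, last_seen):
    """Keep only versions newer than what an earlier run already recorded."""
    return [c for c in fetched
            if c.version > last_seen.get((c.kind, c.item_id), 0)]


def replay(fetched, last_seen, commit):
    """Sort the unseen changes by date and hand them to `commit` one by one."""
    pending = sorted(new_changes(fetched, last_seen),
                     key=lambda c: (c.when, c.version))
    for change in pending:
        commit(change)  # in the real importer this step would create a git commit
        last_seen[(change.kind, change.item_id)] = change.version


fetched = [
    Change("page", "Home", 2, "alice", datetime(2010, 9, 20, 12, 0), "fix typo"),
    Change("attachment", "logo.png", 1, "bob", datetime(2010, 9, 19, 8, 30)),
    Change("page", "Home", 1, "alice", datetime(2010, 9, 18, 9, 0)),
]
last_seen = {("page", "Home"): 1}   # version 1 was imported by an earlier run
log = []
replay(fetched, last_seen, log.append)
```

After this run `log` holds the attachment upload followed by version 2 of the page, and a second `replay` with the same input commits nothing, which mirrors the "nothing new to fetch" refresh described in the message.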
On 09/21/2010 04:50 PM, Joe Schaefer wrote:
> About 10M hits a day. No java app hosted on
> a single machine would survive for 5 minutes
> with our load.

Are those just page requests (html), or everything (images + css + other files)?
I don't recall the breakdown; the 10M figure counts total daily traffic. You could look at Vadim's stats for more details. The bottom line is that no app that doesn't support static exports will function as a suitable CMS for Apache, which is why we're rolling our own.

----- Original Message ----
> From: Adam Heath <[hidden email]>
> To: Joe Schaefer <[hidden email]>
> Cc: Jacques Le Roux <[hidden email]>; [hidden email]; [hidden email]
> Sent: Tue, September 21, 2010 6:11:53 PM
> Subject: Re: ofbiz wiki(confluence)
>
> On 09/21/2010 04:50 PM, Joe Schaefer wrote:
> > About 10M hits a day. No java app hosted on
> > a single machine would survive for 5 minutes
> > with our load.
>
> Are those just page requests(html), or everything(images+css+other files)?
On 09/21/2010 05:15 PM, Joe Schaefer wrote:
> I don't recall the breakdown, the 10M figure
> counts total daily traffic. You could look
> at Vadim's stats for more details- the bottom
> line is that no app that doesn't support static
> exports will function as a suitable CMS for
> Apache, which is why we're rolling our own.

Ok, I've looked. 10M seems to be *all* requests. That number doesn't scare me. I've got a site live right now, http://www.hailmerry.com/, that is reporting 190 req/s with ab (from Apache httpd) while on localhost. That's without any fancy supercache sitting in front. This site supports online live editing of content. Anonymous users have no session, and we make a point of reducing database access for hot-points.

This is running on a single large shared iscsi disk host. The cpu node is an 8-way 2.33GHz cpu, using xen, with the domU given 512M and a single cpu. The hosting framework is nowhere near what could be considered super-fast.

The main feature, that I haven't mentioned before now, that helps with this, is a thing we have designed called TTLObject. It is designed to protect method calls, by saving their results, and returning old values for a certain amount of time. It is non-blocking and uses a state engine internally. It was overhauled to follow the design patterns in Java Concurrency in Practice.

Again, I'm not afraid of those numbers. Our cms stuff (which isn't quite up to the same level of feature integration as confluence) stores most data as raw files (to allow svn/git tracking of history). The database is not used for most functions.
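The idea behind TTLObject as described above (save a method's result and keep returning the old value for a certain amount of time) can be sketched as follows. This is not the actual TTLObject code, which is Java and is described as non-blocking with an internal state engine. The class name `TTLValue`, its parameters, and the short lock around the bookkeeping are all assumptions made for illustration.

```python
import threading
import time


class TTLValue:
    """Cache the result of `compute` and reuse it until `ttl_seconds` have passed.

    Hypothetical sketch: while one caller is refreshing an expired value,
    other callers get the old value back instead of waiting.
    """

    def __init__(self, compute, ttl_seconds, clock=time.monotonic):
        self._compute = compute
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()   # guards the fields below, never held during compute
        self._value = None
        self._expires = None            # None means "never computed yet"
        self._refreshing = False

    def get(self):
        with self._lock:
            have_value = self._expires is not None
            if have_value and (self._clock() < self._expires or self._refreshing):
                # Fresh, or stale but already being refreshed by another caller.
                return self._value
            self._refreshing = True
        # The expensive call runs outside the lock. On the very first call,
        # concurrent callers may each compute once; this sketch accepts that.
        try:
            value = self._compute()
        except BaseException:
            with self._lock:
                self._refreshing = False
            raise
        with self._lock:
            self._value = value
            self._expires = self._clock() + self._ttl
            self._refreshing = False
        return value


calls = []
now = [0.0]                      # fake clock so the example is deterministic


def load_page():
    calls.append(1)
    return len(calls)


cached = TTLValue(load_page, ttl_seconds=10, clock=lambda: now[0])
first = cached.get()             # computes
now[0] = 5.0
second = cached.get()            # still within the TTL, served from the cache
now[0] = 10.0
third = cached.get()             # TTL has elapsed, recomputed
```

With this kind of wrapper a hot page hits the expensive call at most about once per TTL window, however many requests arrive in between.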
On 09/21/2010 05:45 PM, Adam Heath wrote:
> On 09/21/2010 05:15 PM, Joe Schaefer wrote:
>> I don't recall the breakdown, the 10M figure
>> counts total daily traffic. You could look
>> at Vadim's stats for more details- the bottom
>> line is that no app that doesn't support static
>> exports will function as a suitable CMS for
>> Apache, which is why we're rolling our own.
>
> Ok, I've looked. 10M seems to be *all* requests. That number doesn't
> scare me. I've got a site live right now, http://www.hailmerry.com/,
> that is reporting 190req/s with ab(from apache http) while on localhost.
> That's without any fancy supercache sitting in front. This site supports
> online live editing of content. Anonymous users have no session, and we
> make a point of reducing database access for hot-points.
>
> This is running on a single large shared iscsi disk host. The cpu node
> is an 8-way 2.33Ghz cpu, using xen, with the domU given 512M and a
> single cpu. The hosting framework is nowhere near what could be
> considered super-fast.
>
> The main feature, that I haven't mentioned before now, that helps with
> this, is a thing we have designed called TTLObject. It is designed to
> protect method calls, by saving their results, and returning old values
> for a certain amount of time. It is non-blocking, uses a state engine
> internally. It was overhauled to follow the design patterns in Java
> Concurrency in Practice.
>
> Again, I'm not afraid of those numbers. Our cms stuff(which isn't quite
> up to the same level of feature integration as confluence), stores most
> data as raw files(to allow svn/git tracking of history). The database is
> not used for most functions.

I'm still trying to understand the distribution of load. If my understanding is wrong, then please tell me. It'll give me a target to strive for.

I've looked at more graphs at (1), and I see a breakdown of page views per major sub-site (host).

Our current framework underwent a major rewrite 4 years ago, and it was first deployed into a production state 3.5 years ago. That particular site did 3,500,000 total requests the day after it went live, 500,000 of them page views. There was no super-cache in front of it. The version of the software at the time had a bug, where If-Modified-Since processing didn't work, so all images were always fetched by clients.

Our filesystem code has been rewritten to be non-blocking (no synchronized keywords), the cow/overlay feature has had a second-order of speedups, plus other speed fixes. If I 'flatten' the cow/overlay system, so it's not used, the system easily approached 1000 req/s (single page at a time).

So, I'm still not afraid of these numbers.

Now, if you were to combine *all* these hosts into one, and then try to run it, we might have an issue. But, that's not what is currently happening.

And, with the 10M number you originally gave, that is 115 req/s. If that is for a single page, then our current software handles that fine. If that is for 115 different pages at the same time, then I will have to get back to you, to try that test. I don't have a program that can request 115 different pages at once.

Also, if the problem is with remote clients tying up a thread/connection slot, then that is a separate problem from the backend system. The backend system should have a small thread pool, so that it can run fast, and then the frontend (either catalina itself, or apache/mod-jk) does the send using non-blocking-io.

1: http://people.apache.org/~vgritsenko/stats/index.html
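For reference, the per-second figures in this message follow from dividing the daily totals by the 86,400 seconds in a day. The helper name `average_rate` below is only illustrative, and these are day-long averages, so peak rates will be higher than the numbers shown.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400


def average_rate(requests_per_day):
    """Average requests per second over a full day (peaks will be higher)."""
    return requests_per_day / SECONDS_PER_DAY


figures = [
    ("10M total requests/day", 10_000_000),
    ("3.5M total requests/day", 3_500_000),
    ("500k page views/day", 500_000),
]
for label, per_day in figures:
    print(f"{label}: {average_rate(per_day):.1f} req/s")
# 10M total requests/day: 115.7 req/s
# 3.5M total requests/day: 40.5 req/s
# 500k page views/day: 5.8 req/s
```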
On 22/09/2010 09:09, Adam Heath wrote:
> Now, if you were to combine *all* these hosts into one, and then try to
> run it, we might have an issue. But, that's not what is currently
> happening.

Not sure what you mean here. www.a.o and every tlp.a.o site are served from a single httpd instance (along with a handful of other virtual hosts). Everything we require the cms to handle is currently handled by a single httpd instance.

There are actually two machines. One in the EU and one in the US. Normally we use geo-based load balancing but a single machine has to be able to handle all of the traffic comfortably so we can do maintenance on the other.

> And, with the 10M number you originally gave, that is 115req/s. If that
> is for a single page, then our current software handles that fine. If
> that is for 115 different pages at the same time, then I will have to
> get back to you, to try that test. I don't have a program that can
> request 115 different pages at once.

That is across all virtual hosts so they will be different pages. http://www.apache.org/server-status will give you a snapshot of current load.

> Also, if the problem is with remote clients tying up a thread/connection
> slot, then that is a separate problem from the backend system. The
> backend system should have a small thread pool, so that it can run fast,
> and then the frontend(either catalina itself, or apache/mod-jk) does the
> send using non-blocking-io.

The problem is that the systems we have tried before can't handle the load. It sounds like you are using Tomcat under the covers. Whilst you will maximise throughput when the threadpool size is roughly the same as the number of cores on the machine, the overhead of having a few hundred processing threads isn't that great. We would always front Tomcat with httpd so an appropriate mod_proxy/mod_jk/Tomcat connector config can ensure that we don't have to have one Tomcat thread per current connection (since with keep-alive connections >> requests).

Mark