Can OFBiz be used for the following requirement?
How can I go about achieving this with OFBiz? I have a folder of Microsoft Word and PDF documents. Can I leverage OFBiz to search the contents of these documents, find the matches, and then view them? That is, search for keywords in each document and map each match back to its document for viewing. FYI, I posted this question before but somehow I cannot find it, so I am reposting it. Thanks, -T
Hi Tom,
I believe what you are looking for is a very custom solution that does not come out of the box with OFBiz. I think you should take a look at Apache POI (http://poi.apache.org/) and Apache Lucene (https://lucene.apache.org/). Used together, POI can read the Microsoft documents and Lucene can index their text for searching. Another option is to integrate with a document management system that supports indexed search of binary documents. There are many open source solutions that run on the JVM. Regards, Taher (In reply to Tom Running's message "Ofbiz search engine", sent Monday, 18 May, 2015 8:32:48 PM.)
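To make the "index them for searching" step concrete, here is a minimal sketch of the underlying idea: an inverted index that maps each keyword to the documents containing it. It is not Lucene and not part of OFBiz. It assumes the text has already been extracted from the Word and PDF files (for example with POI), and the file names and sample text are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it.

    `documents` maps a file name to text that has ALREADY been extracted
    from the binary file; the extraction step itself is not shown here.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the sorted names of documents that contain every query word."""
    words = tokenize(query)
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


# Hypothetical sample data standing in for extracted document text.
docs = {
    "invoice.docx": "Invoice for the March order",
    "manual.pdf": "Order processing manual",
}
index = build_index(docs)
print(search(index, "order"))
print(search(index, "March order"))
```

Each result is a document name, so the application can link straight to the file for viewing. A real search library adds ranking, stemming, and on-disk storage on top of this basic mapping.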
Tom,
Taher is right. These should also help you: https://issues.apache.org/jira/browse/OFBIZ-5042 and https://stackoverflow.com/questions/2802000/how-do-i-index-documents-in-solr Jacques (In reply to Taher Alkhateeb's message of 18/05/2015 19:58.)
In reply to this post by taher
Taher and Jacques,
Thank you for the information. I am wondering whether anyone has attempted to integrate such features into OFBiz. I would love to hear your opinions. -T
Hi Tom,
I’ve seen a solution for this that combines OFBiz and Solr, using the Tika library to index document contents. See https://tika.apache.org. This could be a nice to-do in addition to the not-yet-finished Solr integration issue OFBIZ-5042; maybe I will investigate it a bit in the near future... Regards, Martin Becker ecomify GmbH www.ecomify.de (In reply to Tom Running's message of 20.05.2015 21:49.)
Martin,
It would be very nice to have Tika integrated with OFBiz. Tika seems to be a good solution. I would like to help out however I can; let me know. It sounds like Solr is already packaged with OFBiz. Now I need to figure out how to integrate with Tika's Parser API to give the eCommerce web site searching and indexing capability. Is this a good assumption? T (In reply to Martin Becker's message of Thu, May 21, 2015 at 7:52 AM.)
Martin,
I have downloaded and built Tika. I need the Solr Cell ExtractingRequestHandler. Do you know whether I have to download the whole Solr package? Does it require the whole Solr package to run, or do I only need the Solr ExtractingRequestHandler jar file? Can I use the existing Solr that comes with OFBiz? I am assuming Solr has been packaged with OFBiz. Or do I have to install a new Solr? If you can point me in the right direction, I can try to build and implement Tika and Solr with OFBiz. Thanks, T (Following up on my message of Fri, May 22, 2015 at 11:56 AM.)
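For context on what the ExtractingRequestHandler expects, here is a minimal sketch that only builds the request URL. It assumes the handler is registered at the conventional `/update/extract` path in `solrconfig.xml`, and it uses a hypothetical Solr base URL and a hypothetical core named `documents`; none of these values come from this thread. The `literal.id` and `commit` parameters are the ones the Solr Cell documentation describes. The sketch does not send the file and does not settle whether the Solr bundled via OFBIZ-5042 can be used.

```python
from urllib.parse import urlencode


def extract_url(solr_base, core, doc_id, commit=True):
    """Build the URL for posting one document to Solr Cell.

    solr_base and core are assumptions about the local installation;
    doc_id becomes the unique key of the indexed document.
    """
    params = {"literal.id": doc_id}
    if commit:
        # Ask Solr to make the document searchable right after indexing.
        params["commit"] = "true"
    return f"{solr_base.rstrip('/')}/{core}/update/extract?{urlencode(params)}"


# Hypothetical host, core, and file name, used only for illustration.
url = extract_url("http://localhost:8983/solr", "documents", "manual.pdf")
print(url)
```

The Word or PDF file itself would be sent as the body of a POST request to this URL, and Solr would then pass it to Tika for text extraction before indexing.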
For Solr, so far there is only https://issues.apache.org/jira/browse/OFBIZ-5042
It's still a WIP, though it works if you take care of the conflict with Lucene as explained there. The security issue should not be a problem, though we have to check the Solr admin application. Jacques (In reply to Tom Running's message of 28/05/2015 21:49.)