Ofbiz search engine

Tom Running

Ofbiz search engine

Can OFBiz be used for the following requirement?
How can I go about achieving this with OFBiz?

I have a folder of documents in Microsoft Word and PDF format.
Can I leverage OFBiz to search the contents of these documents, find the
matching ones, and then open them for viewing?

That is, search for keywords, find the documents that contain them, and map
each match back to its document so it can be viewed.

FYI, I posted this question before, but somehow I cannot find it, so I am
reposting it.

Thanks,
-T

taher

Re: Ofbiz search engine

Hi Tom,

I believe what you are looking for is a very custom solution that does not come out of the box with OFBiz.

I think you should take a look at Apache POI (http://poi.apache.org/) and Apache Lucene (https://lucene.apache.org/). You can use the two together: POI to read the contents of Microsoft documents and Lucene to index them for searching.
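
As a rough illustration of what the indexing step does, here is a minimal Python sketch of an inverted index. It assumes the text has already been extracted from each file (the job POI would do for Word documents); the file names and sample text are made up, and a real setup would use Lucene rather than this toy structure.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(extracted):
    """Map each word to the set of document names that contain it.

    `extracted` is a dict of {document name: plain text}; the text is
    assumed to have been pulled out of the binary file beforehand.
    """
    index = defaultdict(set)
    for name, text in extracted.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the sorted names of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical sample data standing in for extracted document text.
docs = {
    "invoice-policy.docx": "Invoices are due within thirty days.",
    "returns.pdf": "Returns are accepted within thirty days of delivery.",
}
index = build_index(docs)
print(search(index, "thirty days"))
print(search(index, "invoices"))
```

Each hit is a document name, so the result list can be turned directly into links for viewing the matching files.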

Another option is to integrate with a document management system that supports indexed search of binary documents. There are many open-source solutions out there that run on the JVM.

Regards,

----- Original Message -----

From: "Tom Running" <[hidden email]>
To: [hidden email]
Sent: Monday, 18 May, 2015 8:32:48 PM
Subject: Ofbiz search engine


Jacques Le Roux

Re: Ofbiz search engine

Administrator

Tom,

Taher is right. These should also help you:
https://issues.apache.org/jira/browse/OFBIZ-5042
https://stackoverflow.com/questions/2802000/how-do-i-index-documents-in-solr

Jacques

On 18/05/2015 19:58, Taher Alkhateeb wrote:


Tom Running

Re: Ofbiz search engine

In reply to this post by taher

Taher and Jacques,

Thank you for the information.

I am wondering whether anyone has attempted to integrate such features into OFBiz.
I would love to hear your opinions.

-T


Martin Becker

Re: Ofbiz search engine

Hi Tom,

I’ve seen a solution for this that combines OFBiz and Solr, using the Tika library to index document contents.
See here: https://tika.apache.org
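
For reference, Solr exposes Tika through its ExtractingRequestHandler (Solr Cell), which is conventionally mapped to `/update/extract`. The Python sketch below only builds the request URL for posting one file; the host, port, core name `documents`, and document id are assumptions, and the handler has to be enabled in the Solr configuration for the request to work.

```python
from urllib.parse import urlencode


def extract_url(solr_base, core, doc_id, commit=True):
    """Build the Solr Cell URL used to post one binary document.

    `literal.id` sets the unique key of the indexed document and
    `commit=true` makes it searchable right away.
    """
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return f"{solr_base.rstrip('/')}/{core}/update/extract?{urlencode(params)}"


# Assumed values: a local Solr on the default port with a core named "documents".
url = extract_url("http://localhost:8983/solr", "documents", "returns.pdf")
print(url)
```

The file itself would then be sent as the body of a POST to that URL, for example with `curl -F "myfile=@returns.pdf"`.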

This could be a nice to-do on top of the not-yet-finished Solr integration issue OFBIZ-5042. Maybe I will investigate it a bit in the near future...

Regards,

Martin Becker
ecomify GmbH
www.ecomify.de

> On 20.05.2015 at 21:49, Tom Running <[hidden email]> wrote:

Tom Running

Re: Ofbiz search engine

Martin,

It would be very nice to have Tika integrated with OFBiz. Tika seems to
be a good solution. I would like to help out however I can; let me know.
It sounds like Solr is already packaged with OFBiz.

Now I need to figure out how to integrate Tika's Parser API with the
eCommerce web site's searching and indexing capability. Is this a good
assumption?

T

On Thu, May 21, 2015 at 7:52 AM, Martin Becker <[hidden email]>
wrote:


Tom Running

Re: Ofbiz search engine

Martin,

I have downloaded and built Tika.
I need the Solr Cell ExtractingRequestHandler. Do you know whether I have
to download the whole Solr package? Does it need the whole Solr package
to run, or do I only need the Solr ExtractingRequestHandler jar file?

Can I use the existing Solr that comes with OFBiz? I am assuming Solr has been
packaged with OFBiz.

Or do I have to set up a new Solr?

If you guys can point me in the right direction, I can try to build and
implement Tika and Solr with OFBiz.

Thanks,
T

On Fri, May 22, 2015 at 11:56 AM, Tom Running <[hidden email]> wrote:


Jacques Le Roux

Re: Ofbiz search engine

Administrator

For Solr, so far there is only https://issues.apache.org/jira/browse/OFBIZ-5042
It is still a work in progress, though it works if you take care of the conflict with Lucene, as explained there.
The security issue should not be a problem, though we still have to check the Solr admin application.

Jacques

On 28/05/2015 21:49, Tom Running wrote:
