Re: data related questions

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

Re: data related questions

David E Jones-2

On Aug 14, 2006, at 7:52 AM, <[hidden email]> wrote:

> A few more questsions if you don't mind:
>
> (1) We will need an ETL tool to build our data conversions.  I've  
> been looking at the Open Source offerings in this area, and the  
> only one I've come across that might fit the bill at this time is  
> Kettle.  They now appear to be under the Pentaho umbrella, and I'm  
> not sure if that would affect their viability with OFBiz.  The  
> other tool I've been considering is Microsoft's SSIS (SQL Server  
> Integration Services).  If you're not familiar with that product  
> it's the bundled ETL tool that comes with SQL Server 2005. I've  
> extensively used DTS, which is SSIS's precursor, and I think it  
> would be a pretty good ETL platform. (It also works seamlessly with  
> my OLE DB provider for SAP).  Do you see any problem with using  
> SSIS?  Or do you have any other suggestions for a more appropriate  
> ETL tool?

There shouldn't be any problem. My experience with DTS/SSIS is  
limited, but we had one fairly large client that used MS SQL server  
as their OFBiz database, and then used the various SQL Server tools,  
like SSIS, for their data management needs. So yes, that should be fine.

This is one of the framework areas that may grow over time. We do  
have some tools that make data querying and transformation easier and  
that may eventually form a basis for an ETL and other migration and  
data warehouse oriented tools in OFBiz. The benefit here would be to  
be able to re-use existing artifacts, and to have as much as possible  
running in a consistent set of tools.

But no, this doesn't exist yet, and this sort of thing can be  
effectively managed on the data level so this sort of approach is  
basically what everyone does right now (who needs this sort of  
thing...).

> (2) What do you think is the best way to initially populate the  
> OFBiz database - through the classes supplied in the ERP or via  
> direct updating of the underlying tables?

It mostly depends on where the data is coming from. For the seed and  
demo data in OFBiz we use XML files that follow a format that is  
handled automatically by the Entity Engine based on the same entity  
definitions that tables and such are created from. This is a  
convenient way to have seed and test data in your code repository  
that can be used to build a fresh database very easily ("ant run-
install") and fairly quickly.

If you have data coming from other databases or systems then pushing  
it into the database directly is fine. There are certain fields that  
are derived from other fields and you can avoid manually managing  
these by loading data through the Entity Engine, but especially for  
initial imports we usually just do database-to-database SQL scripts  
to move data around. Especially for large databases this is the  
fastest and most efficient way to do it, though it does require that  
the data be in a database driven by the same software as the OFBiz  
database will be using. Unless, of course, you have a good ETL tool  
to move between different databases...

> (3) My BI partners (and most BI vendors in general) use Windows on  
> the client.  Do you see any reason why they shouldn't use ODBC to  
> read the OFBiz database?

For read-only purposes you can have at it at pretty much any time,  
even when the system is in production. For heavy query loads you'll  
want to move your data into a data warehouse periodically and query  
on that instead of the transactional database. This is standard  
practice, of course, to avoid problems with overloaded hardware, or  
locking conflicts and the like.

-David