[jira] [Commented] (OFBIZ-10953) have CurrencyDimension have a dimensionId that is based on the natural key

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

[jira] [Commented] (OFBIZ-10953) have CurrencyDimension have a dimensionId that is based on the natural key

Nicolas Malin (Jira)

    [ https://issues.apache.org/jira/browse/OFBIZ-10953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17032320#comment-17032320 ]

Pierre Smits commented on OFBIZ-10953:
--------------------------------------

The reasoning we applied to having a different primary key for the Uom table in the transaction db (an exception to the primary key definition for majority of the tables) applies also to the DWH.

For those not having access to the Data Warehouse Toolkit book (which is the starting point for the project's approach to Bi through the same named component) the first line states:

??Surrogate keys are used to implement the primary keys of *almost all* dimension tables.??

So, which dimension tables can the project consider to be the exceptions?
 These are the generic dimension tables that are uniform in meaning across businesses, e.g. those data sets defined by standardisation bodies.
 Measurement dimension tables, like date, time, but also country, currency and all those other groups of records which are relating to records in the Uom table in the transaction db are such generic tables. These tables don't need a 'self defined' primary keys added, because the natural key is self-explanatory.

That is why Kimball in his book suggests for the Date dimension table to use the ISO 8601 (specifically YYMMDD) for the date as the definition for the primary key. And the same is suggested for the Time dimension table (HHMM in that case).

The validation for these exceptions are:
 # the natural key is self-explanatory,
 # data abstraction negatively impacts query performance and legibility.

Compare following examples:
{code:java}
<SalesInvoiceItemFact billToCustomerDimId="_NA_" contributionAmount="2250.000" createdStamp="2019-05-29 10:05:01.0" createdTxStamp="2019-05-29 10:03:24.0" customerDimId="12001" extDiscountAmount="0.000" extDistributionCost="0.000" extGrossAmount="2250.000" extManFixedCost="0.000" extManVarCost="0.000" extNetAmount="2250.000" extStorageCost="0.000" extTaxAmount="0.000" invoiceDateDimId="20151008" invoiceId="95" invoiceItemSeqId="00001" lastUpdatedStamp="2019-05-29 10:05:01.0" lastUpdatedTxStamp="2019-05-29 10:03:24.0" orderId="_NA_" organisationDimId="10010" origCurrencyDimId="EUR" productDimId="10087" quantity="1.000000" quantityUomDimId="10"/>
{code}
vs
{code:java}
<SalesInvoiceItemFact billToCustomerDimId="_NA_" contributionAmount="2250.000" createdStamp="2019-05-29 10:05:01.0" createdTxStamp="2019-05-29 10:03:24.0" customerDimId="12001_" extDiscountAmount="0.000" extDistributionCost="0.000" extGrossAmount="2250.000" extManFixedCost="0.000" extManVarCost="0.000" extNetAmount="2250.000" extStorageCost="0.000" extTaxAmount="0.000" invoiceDateDimId="20151008" invoiceId="95" invoiceItemSeqId="00001" lastUpdatedStamp="2019-05-29 10:05:01.0" lastUpdatedTxStamp="2019-05-29 10:03:24.0" orderId="_NA_" organisationDimId="10010" origCurrencyDimId="10000" productDimId="10087" quantity="1.000000" quantityUomDimId="LEN_m"/>
{code}
The latter means that in a production infrastructure the using company is penalised (performance and cost-wise) with an additional query (for the *origCurrencyDimId*) to the currency dimension to retrieve the underlying explanation/meaning (EUR). And similarly for the *quantityUomDimId* and other generically defined measurements (e.g. in the examples the  *invoiceDateDimId*).

This may not mean much in DHW implementations with a low number of records in fact tables. But when there a multitude of such fact tables with each having millions/billions of records in those tables using such keys of generic tables (these measurement tables), then it adds up significantly. 

> have CurrencyDimension have a dimensionId that is based on the natural key
> --------------------------------------------------------------------------
>
>                 Key: OFBIZ-10953
>                 URL: https://issues.apache.org/jira/browse/OFBIZ-10953
>             Project: OFBiz
>          Issue Type: Improvement
>          Components: bi
>    Affects Versions: Trunk, Release Branch 17.12, Release Branch 18.12
>            Reporter: Pierre Smits
>            Assignee: Pierre Smits
>            Priority: Major
>              Labels: CurrencyDimension, birt, currency, dimension, dwh
>         Attachments: OFBIZ-10953-BI.patch
>
>
> Currently the record sequencer (delegator.getNextSeqId) is used to determine the dimensionId for the CurrencyDimension. This is unnecessary as the uomId from the UOM table can be used for currency.
> It also makes it easier to set the foreign-key in fact tables by generating it based on the date provided, than by retrieving the dimensionId based on a retrieval through the getDimensionIdFromNaturalKey service.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)