Question

Best practice for joining data in models

  • 26 April 2024

I am creating a transformation that joins data from two different views/containers, where table B has a node reference to table A. I have looked for documentation on this, but I have not found any so far.

 

Through friends and trial and error I have found two options, and neither seems to be performing well. 

Example:

Type A {
    Name: String
}

Type B {
    Name: String
    A_ref: A
}

The first possible solution I have found is to go through cdf_data_models and pick the externalId from the node reference:
from cdf_data_models(<spc>, <mod>, <ver>, "A") as A join cdf_… as B on A.external_id = B.A_ref.externalId

The second is the same as above, but joins on A as a node reference:

on B.A_ref = node_reference(<spc>, A)
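
To make the difference between the two join conditions concrete, here is a minimal sketch in plain Python. It is only a toy illustration, not the CDF or Spark API: it assumes a node reference is a (space, externalId) pair, and the class names, function names and sample rows are all hypothetical. It shows how the two conditions match rows and says nothing about how the real queries perform.

```python
from typing import NamedTuple, Optional

# Hypothetical stand-ins for rows of the two views; not CDF types.
class NodeRef(NamedTuple):
    space: str
    external_id: str

class A(NamedTuple):
    space: str
    external_id: str
    name: str

class B(NamedTuple):
    external_id: str
    name: str
    a_ref: Optional[NodeRef]  # assumed shape of a node reference

def join_on_external_id(a_rows, b_rows):
    """Option 1: compare only the externalId inside the reference."""
    index = {}
    for a in a_rows:
        index.setdefault(a.external_id, []).append(a)
    return [(a.name, b.name)
            for b in b_rows if b.a_ref
            for a in index.get(b.a_ref.external_id, [])]

def join_on_node_reference(a_rows, b_rows):
    """Option 2: compare the full (space, externalId) reference."""
    index = {NodeRef(a.space, a.external_id): a for a in a_rows}
    return [(index[b.a_ref].name, b.name)
            for b in b_rows if b.a_ref in index]

# Made-up sample data: two A nodes share an externalId across spaces.
a_rows = [A("spc1", "a1", "first"), A("spc2", "a1", "other-space")]
b_rows = [B("b1", "child", NodeRef("spc1", "a1")), B("b2", "orphan", None)]

by_id = join_on_external_id(a_rows, b_rows)
by_ref = join_on_node_reference(a_rows, b_rows)
```

In this toy data the externalId-only join matches the A node in both spaces, while the full-reference join matches only the one in the referenced space. The two options are therefore equivalent only when external IDs are unique across the spaces being joined.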

 

 

Neither of these options seems to be documented anywhere, and I can't find any other approaches documented either. Reads that use these joins seem slow, even though I have set up a few indexes which should cover the different joins. They are also slower than reading a comparable data set from RAW.

