
Hi

Since the release of nested workflows, I have been looking for suitable use cases in our environment. I have sketched some possible flows, but there are some cases I don't know how to handle in the best way.

Consider this scenario: we have two source systems (source X and Y), and each system has a source model that is populated via transformations. The transformations are divided into multiple workflows based on which project they get data from. I have two solution models (model 1 and 2) that get data from the source systems. Model 2 gets data from both source models, while model 1 only gets data from source X.

My issue is that data from the source systems is ingested into CDF raw tables once every morning, and the two sources arrive at different times. If data from source X is in CDF before data from source Y, I want the workflow tasks relevant to solution model 1 to run, but not those for solution model 2, because the source Y data is not ready yet. However, once both sources have ingested data, if solution model 1 is already updated, there is no point in running that part of the pipeline again.

I might have made this problem a bit more complicated than it needs to be, but in short, I want the data in each solution model to be updated as soon as possible without having to do unnecessary workflow runs. I guess having different scheduled triggers for each source system in the workflow would solve the problem.

From each solution model's point of view:

  • Model 1 workflow should be triggered as soon as source model X is updated.
  • Model 2 workflow should be triggered as soon as both sources are updated.
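
Written out as code, the rule I have in mind looks roughly like the sketch below. It is plain Python with no CDF calls. The function name `models_to_run` and its timestamp arguments are made up for illustration, and it assumes I can find out when each source last landed in raw and when each solution model was last refreshed.

```python
from datetime import datetime
from typing import List, Optional


def models_to_run(
    x_arrived: Optional[datetime],
    y_arrived: Optional[datetime],
    model1_refreshed: Optional[datetime],
    model2_refreshed: Optional[datetime],
) -> List[str]:
    """Return the solution models that have new source data to process.

    All arguments are hypothetical timestamps; None means "not yet".
    """

    def is_stale(refreshed: Optional[datetime], *arrivals: Optional[datetime]) -> bool:
        # Every required source must have arrived ...
        if any(a is None for a in arrivals):
            return False
        # ... and at least one must be newer than the last refresh.
        return refreshed is None or max(arrivals) > refreshed

    to_run = []
    # Model 1 depends on source X only.
    if is_stale(model1_refreshed, x_arrived):
        to_run.append("model_1")
    # Model 2 depends on both source X and source Y.
    if is_stale(model2_refreshed, x_arrived, y_arrived):
        to_run.append("model_2")
    return to_run
```

With only source X landed, this returns just `model_1`. Once source Y lands as well and model 1 has already been refreshed, it returns just `model_2`.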

In its simplest form, the nested workflow would look something like this, where each task in the figure is a sub-workflow:

 

Are nested workflows a suitable way to solve this issue, and if so, how? Are there any more suitable approaches?

 

If you got to the bottom, thank you for taking the time to read through this long post! :)

Sebastian

Hi Sebastian

Nested workflows allow you to decompose a complex workflow into multiple reusable parts. So they could help you structure your workflows in a nicer way and reuse the common parts, but they will not solve your actual start-time dependency issue.

We are working on an event trigger solution that should solve your problem, but it will not be released for a few months. In the meantime, I cannot think of anything better than a scheduled trigger of the workflows, unfortunately.
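
If you go the scheduled route, one way to keep a frequent schedule cheap might be to let every scheduled run start with a small guard that remembers what it has already processed. The following is only a rough sketch in plain Python and does not use the Cognite SDK. The class `RefreshGuard`, the source and model names, and the batch identifiers are all assumptions for illustration, and in practice the state would have to be persisted somewhere between runs.

```python
from typing import Dict, List, Optional, Tuple

# Which sources each solution model needs (taken from the scenario in the question).
REQUIRED_SOURCES: Dict[str, Tuple[str, ...]] = {
    "model_1": ("source_x",),
    "model_2": ("source_x", "source_y"),
}


class RefreshGuard:
    """Remembers which source batches each model has already processed."""

    def __init__(self) -> None:
        # In-memory only; a real setup would need persistent storage.
        self._processed: Dict[str, Tuple[Optional[str], ...]] = {}

    def claim_due(self, latest_batches: Dict[str, Optional[str]]) -> List[str]:
        """Return the models that should run now and mark them as processed.

        latest_batches maps a source name to a hypothetical batch id for the
        current cycle (for example the ingestion date), or None if that
        source has not landed yet.
        """
        due_models = []
        for model, sources in REQUIRED_SOURCES.items():
            batches = tuple(latest_batches.get(s) for s in sources)
            if None in batches:
                continue  # a required source has not arrived yet
            if self._processed.get(model) == batches:
                continue  # this combination of batches was already processed
            self._processed[model] = batches
            due_models.append(model)
        return due_models
```

Each scheduled tick would call `claim_due` with whatever has landed so far and start only the sub-workflows for the models it returns, so a tick with nothing new does no work.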

Vincent

