Question

10-minute scheduled workflow not working as expected


I have created a workflow that creates dynamic tasks depending on the input: it creates batches of ids and creates tasks out of them. Below is the workflow definition.

from cognite.client.data_classes import (
    DynamicTaskParameters,
    FunctionTaskParameters,
    WorkflowDefinitionUpsert,
    WorkflowTask,
    WorkflowVersionUpsert,
)

WorkflowVersionUpsert(
    workflow_external_id="test_dynamic-0729",
    version="1",
    workflow_definition=WorkflowDefinitionUpsert(
        description="This workflow has two steps",
        tasks=[
            # Step 1: a Cognite Function that builds the list of dynamic tasks.
            WorkflowTask(
                external_id="test_sub_tasks",
                parameters=FunctionTaskParameters(
                    external_id="test_sub_tasks",
                    data="${workflow.input}",
                ),
                retries=1,
                timeout=3600,
                depends_on=[],
                on_failure="abortWorkflow",
            ),
            # Step 2: a dynamic task that executes the tasks returned by step 1.
            WorkflowTask(
                external_id="test_create_sub",
                parameters=DynamicTaskParameters(
                    tasks="${test_sub_tasks.output.response.tasks}"
                ),
                name="Dynamic Task",
                description="Executes a list of workflow tasks for subscription creation",
                retries=0,
                timeout=3600,
                depends_on=["test_sub_tasks"],
                on_failure="abortWorkflow",
            ),
        ],
    ),
)

Also, I have created a function to trigger this workflow. As I want this workflow to fetch new ids and do the required operation, I have scheduled it to run every 10 minutes. Cron expression: */10 * * * *

Issues I am facing:

The workflow schedule is inconsistent; sharing a snapshot. No manual trigger was done.

Impact: As part of this workflow, I have a step that cleans up a staging table at the end. Because the runs are not following the schedule, multiple runs access the same staging table at the same time. This causes failures, and the workflow is not able to proceed further.

Hi @Rimmi Anand.

Could you expand your schedule (click the “+” icon)? This will list all the Function calls made by this specific schedule. Within this list, you should not see any deviations from the schedule defined by the cron expression. Please share a screenshot if you do see inconsistencies within this list.

The first screenshot you posted (the “Calls” tab) contains all calls to the Function, independent of who called it (schedule or manual).


@Jørgen Lund, sharing a screenshot of the 10-minute schedule (after clicking the “+” icon). The time difference between triggered workflows is not 10 minutes for any of them. Let me know if I missed something.


Hello, @Rimmi Anand

 

I work on the functions team, and am happy to take a look at this. However, to do this I need some information from you. 

  • Which CDF-project is this and what cluster is it running on? 
  • What is the function-ID and schedule-ID for the function and schedule in the pictures above?
  • Does this occur only for this particular schedule, or have you seen this behavior other places?

To me, it seems like the schedule is executed twice. Every ten minutes, and every ten minutes with a 3 minute offset. This is certainly unintended behavior based on the pictures you’ve provided. 
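
To illustrate what that pattern would look like, here is a minimal sketch using only the Python standard library. It does not touch CDF at all: the start time, the helper name `fire_times`, and the idea of a second "shifted" schedule are assumptions made purely for illustration. It merges two 10-minute series, one of them offset by 3 minutes, and prints the gaps between consecutive calls.

```python
from datetime import datetime, timedelta


def fire_times(start, interval_min, offset_min, count):
    """Return `count` fire times spaced `interval_min` apart, shifted by `offset_min`."""
    first = start + timedelta(minutes=offset_min)
    return [first + timedelta(minutes=interval_min * i) for i in range(count)]


start = datetime(2024, 1, 1, 12, 0)  # arbitrary illustrative start time

# One series on the cron boundary, one hypothetical duplicate 3 minutes later.
on_time = fire_times(start, 10, 0, 3)
shifted = fire_times(start, 10, 3, 3)

merged = sorted(on_time + shifted)
gaps = [int((b - a).total_seconds() // 60) for a, b in zip(merged, merged[1:])]

print([t.strftime("%H:%M") for t in merged])
print(gaps)  # gaps alternate between 3 and 7 minutes
```

If the call list for the schedule shows gaps alternating between 3 and 7 minutes rather than a steady 10, that would be consistent with two interleaved 10-minute series.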

 

All the best,

Ivar 

 


Hi @Ivar Stangeby, please find answers inline.

  • Which CDF-project is this and what cluster is it running on?
    cluster: westeurope-1, project: slb-uds-dev
  • What is the function-ID and schedule-ID for the function and schedule in the pictures above?
    function-ID: 2155713045924913
    schedule-ID: 3025103300044215
  • Does this occur only for this particular schedule, or have you seen this behavior other places?
    Currently for this particular schedule

Thanks! 

We’ll do some investigation on our end, and get back to you. 

 

 


Hello again!

 

I’ve been doing some digging, and it’s hard to say exactly what is causing this. I haven’t been able to pinpoint the reason why that exact cron-expression has been triggered twice. 

In the interest of unblocking, does the issue resolve itself if you recreate the schedule/workflow? 

I’ll keep looking tomorrow. 


I have tried creating the schedule with the 10-minute cron expression both using the SDK and from the UI, with similar results. To unblock, I am using longer schedule windows of 20 and 30 minutes. However, this affects the execution of dependent workflows by adding delay.


Okay, thanks for giving it a go. 

 

I’ll keep investigating today! 


An additional thing that would be very helpful @Rimmi Anand is if you could provide me with the actual scheduled time for the calls running every 10 minutes, and those running every 10 minutes with a 3 minute offset. 

This can be achieved by printing the `scheduled_time` from the `function_call_info`-parameter passed to the function. Ref doc: https://docs.cognite.com/de/cdf/functions/#the-function-definition

If you define a function handle with the argument as such: 

def handle(client, data, function_call_info):  # client and data stand in for any other arguments you might need
    print(f"Call was scheduled to run at {function_call_info['scheduled_time']}")

    # Your original code here.


Maybe this can help us glean some insight.
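
Once the scheduled times have been collected, one rough way to compare them is to group them by their minute offset within the 10-minute window. The sketch below is standard-library Python only. The helper `group_by_offset` and the sample values are hypothetical, and it assumes the printed `scheduled_time` values have already been converted to `datetime` objects (the conversion depends on the format the values arrive in, so it is left out).

```python
from collections import defaultdict
from datetime import datetime


def group_by_offset(scheduled_times, interval_min=10):
    """Group timestamps by minute offset within the interval.

    Only meaningful when `interval_min` divides 60 evenly, as 10 does.
    """
    groups = defaultdict(list)
    for ts in scheduled_times:
        groups[ts.minute % interval_min].append(ts)
    return dict(groups)


# Hypothetical sample values, not taken from the actual schedule.
sample = [
    datetime(2024, 1, 1, 12, 0),
    datetime(2024, 1, 1, 12, 3),
    datetime(2024, 1, 1, 12, 10),
    datetime(2024, 1, 1, 12, 13),
]

groups = group_by_offset(sample)
for offset, times in sorted(groups.items()):
    print(offset, [t.strftime("%H:%M") for t in times])
```

A single healthy `*/10` schedule should produce only the offset-0 group. A second group at offset 3 would point to the duplicate series.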

I’ll try to reproduce the issue on my end today. 


Hello again, @Rimmi Anand !

 

I’ve been unable to reproduce the error on my end. 

Just a question: have you tried redeploying the function, and scheduling the newly deployed one instead? I am a bit puzzled as to why this seems to happen only for this specific function-schedule combination.


Hi @Ivar Stangeby, is there any update on this? I am still facing the same issue. I also tried with a different function:

  • cluster: westeurope-1, project: slb-uds-dev
  •  function-ID : 673025218246203   
  •  


Hi @Rimmi Anand, you can now use native data workflow triggers rather than a Cognite Function to automate the execution of your workflows. Check out the docs.

 

This should resolve the issue described above, but we’ll follow up on the other related Hub post.

