Great! Thanks for asking the question, too; it helped us discover some problems in our docs :) Have fun Sparking!
From the previous error, it looks like your Spark installation is Spark 3.2.1 on Scala 2.12, and I suspect this might be what causes that type of issue. As a start, I’d suggest downloading a Spark release built with Scala 2.13!
Ah right, yes, it wants the version (I didn’t paste the full coordinate in my latest comment), so please add 2.0.10.

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/opt/homebrew/Cellar/apache-spark/3.2.1/libexec/jars/spark-unsafe_2.12-3.2.1.jar) to constructor java.nio.DirectByteBuffer(long,int)

This error mentions a Spark jar built for Scala 2.12. Could it also be that your local Spark installation happens to be Spark 3.2.1 on Scala 2.12?
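As a rough illustration of how the warning gives the versions away, here is a minimal Python sketch that reads the Scala binary version and the Spark version out of a jar file name such as spark-unsafe_2.12-3.2.1.jar. The helper name versions_from_jar_path is hypothetical, and the regular expression assumes the name_scalaVersion-sparkVersion.jar pattern seen in the warning above.

```python
import re

# Assumed naming pattern, taken from the warning: spark-<module>_<scala>-<spark>.jar
_SPARK_JAR = re.compile(r"spark-[\w-]+?_(\d+\.\d+)-(\d+(?:\.\d+)+)\.jar")


def versions_from_jar_path(text):
    """Return (scala_binary_version, spark_version) found in text, or None.

    Hypothetical helper for illustration; not part of Spark or the datasource.
    """
    match = _SPARK_JAR.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    warning = (
        "WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform "
        "(file:/opt/homebrew/Cellar/apache-spark/3.2.1/libexec/jars/"
        "spark-unsafe_2.12-3.2.1.jar) to constructor "
        "java.nio.DirectByteBuffer(long,int)"
    )
    print(versions_from_jar_path(warning))
```

The first value is the Scala version that the package suffix (_2.12 or _2.13) needs to match.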
Seems I’m also a little bit out of date here: we made a breaking change some time back and changed the name of the artifact. Does this work?

com.cognite.spark.datasource:cdf-spark-datasource-fat_2.13
https://github.com/cognitedata/cdp-spark-datasource/releases

Releases are here, but I see we haven’t been keeping them fully up to date. 2.13 should also be fine. Do you get an error when you use this command?

./bin/pyspark --packages com.cognite.spark.datasource:cdf-spark-datasource_2.13:2.0.10
Hi!

Running our Spark Datasource with Spark set up locally should be fine, and if you’re able to run PySpark you should have access to the spark-shell command! Our docs give you a helping hand here: https://github.com/cognitedata/cdp-spark-datasource/#quickstart, but the command is simply this:

./bin/pyspark --packages com.cognite.spark.datasource:cdf-spark-datasource_2.12:2.0.10

Make sure to run it from the right directory (the command here should work if run from the folder where you have Spark installed), and please check the Scala version of your local installation and match it (I suggested 2.12 here, which is likely what you have).
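To make the "match the Scala version" advice concrete, here is a small Python sketch that assembles the --packages coordinate from its parts. The function names package_coordinate and pyspark_command are hypothetical; the group, artifact, and version defaults are the ones quoted in this thread, and the artifact name is left as a parameter because it may differ between releases.

```python
def package_coordinate(scala_binary_version, connector_version,
                       group="com.cognite.spark.datasource",
                       artifact="cdf-spark-datasource"):
    """Build a group:artifact_scala:version coordinate for --packages.

    Hypothetical helper; the defaults are copied from the command above.
    """
    return f"{group}:{artifact}_{scala_binary_version}:{connector_version}"


def pyspark_command(coordinate, launcher="./bin/pyspark"):
    """Return the launcher invocation as an argument list (assumed relative path)."""
    return [launcher, "--packages", coordinate]


if __name__ == "__main__":
    # Scala 2.12 is the version suggested in the reply above.
    coordinate = package_coordinate("2.12", "2.0.10")
    print(" ".join(pyspark_command(coordinate)))
```

Changing only the first argument to "2.13" gives the coordinate for a Spark installation built with Scala 2.13.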
A callback URL is a possible solution; we’ll add it to our backlog for consideration :)
Hi Christian, and thanks for your question!

Short answer: no, and probably never.

The CDF API doesn’t support very long-running requests; API requests are currently cut off after 90 seconds (and this is unlikely to change). The wait functionality in the SDK is implemented by polling the transformation run status, which is how it gets around the 90-second limit. However, we can’t implement that as part of the API.
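For readers who want the same behaviour outside the SDK, the polling idea can be sketched generically in Python. This is not the SDK’s implementation: wait_for_completion, the get_status callable, and the status names "completed" and "failed" are all assumptions made for illustration. Each call to get_status is expected to be a short request that returns well within the 90-second limit.

```python
import time


def wait_for_completion(get_status, poll_interval=5.0, timeout=3600.0,
                        terminal_states=("completed", "failed")):
    """Poll get_status() until it returns a terminal state or the timeout expires.

    get_status is a caller-supplied function (hypothetical here) that makes one
    short status request and returns a status string.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in terminal_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still '{status}' after {timeout} seconds")
        # Sleep between requests so the API is not hammered.
        time.sleep(poll_interval)


if __name__ == "__main__":
    # Stand-in for real status requests: two "running" answers, then done.
    fake_statuses = iter(["running", "running", "completed"])
    print(wait_for_completion(lambda: next(fake_statuses), poll_interval=0.1))
```

Because every status check is its own short request, the total wait can be far longer than any single request is allowed to be.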