I am running some ETL jobs in Azure Databricks and have successfully used Cognite’s Spark Data Source to read and write time series, datapoints, etc., to and from CDF. I know that Databricks itself is a cloud platform, but during the development phase it would be useful to run some or all of the jobs locally. Is it possible to test-run these Spark jobs locally? The configuration does not seem to be trivial.
I played around with PySpark and was able to run it on my Mac, but I could not create a connection to “cognite.spark.v1” to read or write data.
Do you know if such an operation is possible? If not, what would you suggest?
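For reference, this is roughly what I tried locally. The Maven coordinates, connector version, project name, and credential option names below are my best guesses from the docs, so any of them may be wrong:

```python
# Sketch of a local PySpark session using the Cognite Spark Data Source.
# The Maven coordinates and version are assumptions -- check Cognite's
# documentation for the artifact matching your Spark/Scala version.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("local[*]")  # run Spark locally on all cores
    .appName("cdf-local-dev")
    # Pull the connector JAR from Maven Central at session start-up.
    .config(
        "spark.jars.packages",
        "com.cognite.spark.datasource:cdf-spark-datasource_2.12:3.2.0",  # hypothetical version
    )
    .getOrCreate()
)

# Read CDF time-series metadata through the "cognite.spark.v1" format.
# The credential options (clientId, clientSecret, tokenUri, project, scopes)
# are assumptions based on a typical OIDC setup -- verify them against the
# connector's README for your CDF project.
ts = (
    spark.read.format("cognite.spark.v1")
    .option("type", "timeseries")
    .option("project", "my-cdf-project")  # hypothetical project name
    .option("clientId", "<client-id>")
    .option("clientSecret", "<client-secret>")
    .option("tokenUri", "https://login.microsoftonline.com/<tenant>/oauth2/v2.0/token")
    .option("scopes", "https://api.cognitedata.com/.default")
    .load()
)
ts.show(5)
```

This is where it fails for me: the session starts, but the read against “cognite.spark.v1” does not succeed.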
Best answer by Reza Parseh