Data Engineer Basics - Integrations



Welcome to the Data Engineer Basics - Integrations!

This discussion is dedicated to helping learners of the Data Engineer Basics – Integrations learning path succeed. If you’re struggling with any of the courses in this learning path, post a comment describing the challenge you’re facing. You can also post your own tips and respond to fellow learners’ questions. Cognite Academy’s instructors are also here to help.


43 replies

Hi @ikequarsh 


After you run

pip install cognite-extractor-manager

and initialize the extractor project with

cogex init

you still need to point your IDE at the project's virtual environment. During initialization, Poetry sets up a virtual environment and installs the Python modules into it, but I'm not sure whether your IDE selects that environment automatically. Just check that the Python interpreter for your IDE’s project is set to: “your-extractor-directory”/venv/bin/python

More native Poetry support was added to PyCharm last year (2021), so if PyCharm is what you are using, there may be some differences from what the course video shows. I'm not that familiar with PyCharm because I mainly use VS Code or Spyder, but it should be similar.

After that, I hope your PyCharm project can find and import all the modules and objects that Poetry installed into the venv during the “cogex init” process.

If you are not familiar with Poetry, I suggest looking it up. It can be seen as the npm of Python: it both sets up virtual environments (isolation) and handles package/module installation and versioning (dependency resolution). Previously this required several tools, such as venv or virtualenv plus pip, but Poetry covers all of it.
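
To confirm that the IDE is really running the interpreter from the virtual environment, a small check like the one below can be run from inside the project. It is only a sketch using the Python standard library; the helper names `in_virtualenv` and `is_importable` are made up for illustration, and "cognite" is assumed to be the top-level package that the extractor dependencies install.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv, while sys.base_prefix
    # still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def is_importable(module_name: str) -> bool:
    """Return True when this interpreter can find the given top-level module."""
    return importlib.util.find_spec(module_name) is not None


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtualenv())
# "cognite" is an assumed package name; swap in whichever module
# your IDE reports as missing.
print("cognite importable:", is_importable("cognite"))
```

If the interpreter path printed here is not the one inside your extractor directory, the IDE is most likely still using the system Python.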

Hope this helps you solve your issue. 

Badge +1

I have the same issue as Miguel, thanks

Userlevel 2
Badge +6

Hi Toye and Miguel, 

We have fixed the problem on our side, and everything is working fine for me.

Could you please run the code again and check whether the problem is solved on your side?

Regards 
Kumar

 

Badge +1

Yes, thanks, it's working for data_sets, but I get the same error when I try to create the asset.

 

assets = [Asset(name='world',data_set_id=xxxxxxx, description='world asset')]

tra = client.assets.create(assets)

 

Thanks

Toye

Userlevel 2
Badge +6

Hi Toye, 
Thanks for posting the question here.
Could you please mention which lesson of the course you are facing the problem in, so that I can test the code on my side and investigate in detail?

Regards 
Kumar

Userlevel 2
Badge +6

Hi Jaydeep,

I have forwarded your problem to our data engineering team. They are investigating it and will get back to you.

Regards 
Kumar

 

Userlevel 2

Hi @Miguel Rosado and @Toye ,

Are you still having problems creating assets? 

Badge +1

Hello!
As Toye mentioned a couple of days ago, that is correct.
When creating an asset, it shows the same permissions error. If I had to guess, this probably applies to creating any kind of item from the SDK (since the same happened with data sets).
Here are the screenshots after running my code.

The error reads:

CogniteAPIError: Resource not found. This may also be due to insufficient access rights. | code: 403 | X-Request-ID: ec318c97-4c11-94e3-b268-1522f4c88b22
The API Failed to process some items.
Successful (2xx): []
Unknown (5xx): []
Failed (4xx): ['worldMRosado_dataset']
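
When reporting an error like this, the status code, the request ID, and the list of failed items are the parts that are easiest to lose in a screenshot. As a rough sketch (the helper `summarize_error` is hypothetical and simply parses the message text shown above with regular expressions; it is not part of the Cognite SDK), they can be pulled out like this:

```python
import ast
import re

# Pattern for the request ID format seen in the error text above.
UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"


def summarize_error(message: str) -> dict:
    """Extract the status code, request ID, and failed items from the error text."""
    code = re.search(r"code:\s*(\d+)", message)
    request_id = re.search(rf"X-Request-ID:\s*({UUID})", message)
    failed = re.search(r"Failed \(4xx\):\s*(\[.*?\])", message)
    return {
        "code": int(code.group(1)) if code else None,
        "request_id": request_id.group(1) if request_id else None,
        "failed": ast.literal_eval(failed.group(1)) if failed else [],
    }


message = (
    "CogniteAPIError: Resource not found. This may also be due to "
    "insufficient access rights. | code: 403 | "
    "X-Request-ID: ec318c97-4c11-94e3-b268-1522f4c88b22\n"
    "The API Failed to process some items.\n"
    "Successful (2xx): []\n"
    "Unknown (5xx): []\n"
    "Failed (4xx): ['worldMRosado_dataset']"
)
print(summarize_error(message))
```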

 

Userlevel 2

Hi again. Very strange, I must say: we have checked the permissions on our end and cannot find the issue. Can you run the notebook 1_Authentication.ipynb up to the cell with the code:

client.iam.token.inspect()

Then paste all the output into a simple txt file and post it here? Thank you for your patience and cooperation.

Badge +1

Hi 

I am new to data engineering. I tried to connect to the database using the following variables but was unable to. Please help me connect to the data source.

 

TENANT_ID = '48d5043c-cf70-4c49-881c-c638f5796997'
CLIENT_ID = '1b90ede3-271e-401b-81a0-a4d52bea3273'
CDF_CLUSTER = 'westeurope-1'
COGNITE_PROJECT = 'publicdata'
SCOPES = [f'https://{CDF_CLUSTER}.cognitedata.com/.default']
AUTHORITY_HOST_URI = 'https://login.microsoftonline.com'
AUTHORITY_URI = AUTHORITY_HOST_URI + '/' + TENANT_ID

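
Two of these values are derived from the others, so a typo in the tenant ID or cluster name silently produces a wrong scope or authority URL. The sketch below only rebuilds those two derived values so they can be printed and checked by eye; `build_auth_settings` is a hypothetical helper, not an SDK function, and it does not perform any authentication itself.

```python
def build_auth_settings(
    tenant_id: str,
    cdf_cluster: str,
    authority_host_uri: str = "https://login.microsoftonline.com",
) -> dict:
    """Derive the scope list and authority URI from the tenant ID and cluster name."""
    return {
        # Same pattern as SCOPES in the variables above.
        "scopes": [f"https://{cdf_cluster}.cognitedata.com/.default"],
        # Same pattern as AUTHORITY_URI in the variables above.
        "authority_uri": f"{authority_host_uri}/{tenant_id}",
    }


settings = build_auth_settings(
    tenant_id="48d5043c-cf70-4c49-881c-c638f5796997",
    cdf_cluster="westeurope-1",
)
print(settings["scopes"])
print(settings["authority_uri"])
```
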
Userlevel 1
Badge +1

Hi @Shaileshkumar Od 

Could you please refer to this code here, try to authenticate accordingly, and let us know whether this resolves your issue? For further information, please also refer to the latest Python SDK documentation here.

Best regards,

Dilesha

Badge +1

Hi ,

 

I completed all sections of the course (Data Engineer Basics - Integrations), but the completion status still shows 84% and the PostgreSQL section shows as pending. I am unable to take the assessment test for Data Engineer Basics - Integrations!

 

Kindly resolve my issue so that I can complete the Data Engineer Basics – Integrations learning path.

Regards

Mirtunjay 

Userlevel 2
Badge +5

Hi @Mirtunjay Kumar,

Can you try again now? The assessment should now be unlocked and the PostgreSQL course marked as completed.

Let us know if you have any more questions. 

 

Best regards,

Maritsa

Userlevel 1
Badge +1

Hello @Maritsa Sarri,

I am new to Cognite and started the Data Engineer training after the foundation training. 

I need help with the data extraction part. In the online self-paced training, the source was defined but the destination was left out. How do I go about it?

Thanks

Isaac

Userlevel 4

Hi @ikequarsh ,
I’m Enikö from Cognite Academy. First, congratulations on your Cognite Data Fusion Fundamentals certificate!

I see you’re working through the Extractor-utils Library for Cognite Python SDK course. I suggest you watch the Defining a Code Schema video that talks about defining the destination.

Let me know if the video has answered your question!
Happy learning!

Enikö

Userlevel 1
Badge +1

Hello @Eniko Farkas

Thanks for your quick response to my enquiry. Please can you share the link to the Defining a Code Schema video? Below is all I see in the Data Engineering training.

 

Userlevel 2
Badge +1

Hi @ikequarsh,

You will find the ‘Defining a Code Schema’ video in the ‘Extractor-utils Library for Cognite Python SDK’ course. Once you are inside the course, go to the ‘Build Your Own Python-based Extractor - Part (1 of 2)’ lesson and click on the ‘Defining a Code Schema’ tab; doing so will take you to the video. 🙂
 


Please find the link below for easy navigation:
https://learn.cognite.com/path/data-engineer-basics-integrations/extractor-utils-library-for-cognite-python-sdk/114686

Let us know if you have any queries. Happy learning!
Nimesh Madandas

Userlevel 1
Badge +1

I think I did that earlier. I am sure there is something wrong with my PyCharm setup and that I need to reconfigure it. It is not able to see some, if not all, of the objects, such as “RawDestinationConfig”. Can someone help?

Userlevel 1
Badge +1

Hello @Stig Harald Gustavsen,

Thanks for your response. I can use VS Code. Could you guide me on how to set it up with Cogex?

I think your training materials need to be revisited, since they expect the user to already be familiar with Cogex.

Let me hear from you. 

Isaac

Userlevel 1
Badge +1

@Stig Harald Gustavsen

I want to clean up my machine and reinstall the Cognite tools. Is there a good way of doing this, since pip uninstall couldn’t get it done?

 

Hi @ikequarsh 

Cogex is a command-line tool that you get when you pip install cognite-extractor-manager, hence the name: CogEx is short for Cognite Extractor.

Setting it up in VS Code should work the same way.

cogex init pre-fills a directory with files, creates a virtual environment, and installs into it the Python modules that the extractor base project needs. You then just set the Python interpreter in your IDE to the virtual environment that Poetry created when you ran cogex init (“your-extractor-directory”/venv/bin/python). It is the same idea as manually setting up a virtual environment, adding the files, and running pip install -r requirements.txt, which is how Python project templates, boilerplates, and forks usually worked before more and more projects moved to Poetry as their dependency and virtual environment manager. The difference is that the Python modules/libraries are now listed in the pyproject.toml file instead of being pip-frozen into requirements.txt.

As they say in the course video on “importing the new extractor project”, you don’t have to use PyCharm for the extractor projects; you can use any IDE or text editor you want. Just set up your environment. I personally would use VS Code and Windows Subsystem for Linux (WSL), but that is my preference. Working with virtual environments on Windows can sometimes be a hassle, and that is where Poetry tries to help.
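
One concrete Windows difference is where the interpreter lives inside a virtual environment: under bin/python on Linux, macOS, and WSL, but under Scripts\python.exe on native Windows. The sketch below just builds the path you would paste into the IDE's interpreter setting. `venv_python` is a hypothetical helper, and the directory name "venv" is taken from the path quoted above; your project may use a different name.

```python
from pathlib import PurePosixPath, PureWindowsPath


def venv_python(project_dir: str, windows: bool = False, venv_name: str = "venv"):
    """Return the interpreter path inside a project's virtual environment."""
    if windows:
        # Native Windows virtual environments keep executables in "Scripts".
        return PureWindowsPath(project_dir) / venv_name / "Scripts" / "python.exe"
    # Linux, macOS, and WSL virtual environments keep executables in "bin".
    return PurePosixPath(project_dir) / venv_name / "bin" / "python"


print(venv_python("your-extractor-directory"))
print(venv_python(r"C:\work\your-extractor-directory", windows=True))
```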

Are you familiar with working with virtual environments in Python?

Hope we get one step closer to solving your issue Isaac :)

Badge +1

Hello. I am going through the hands-on exercise in the ‘Learn to Use the Cognite Python SDK’ course.

While doing the hands-on exercise, I authenticated successfully (Image 1)

 

and when I try to create my dataset (Image 2), from which everything else will be derived as instructed, I get the following error:

I double-checked the ‘answers’ notebook to make sure I wasn’t doing anything out of the ordinary, and it doesn’t seem like it. Are there some permissions needed to do this besides signing up for the course?

Userlevel 2
Badge +6

Hi Miguel, 

Thanks for posting your question here.
Could you please add a screenshot of the error? That will make it easier for us to track down the problem.

Waiting for your reply.

Regards 
Kumar

 

Badge +1

Hello Rahul!
I did add the error there. It reads:

CogniteAPIError: Resource not found. This may also be due to insufficient access rights. | code: 403 | X-Request-ID: 64e60cff-ff68-9db5-b8ec-b366b93d6640
The API Failed to process some items.
Successful (2xx): []
Unknown (5xx): []
Failed (4xx): ['MRosado_dataset']

 

Here is the entirety of the error, if it helps.

 

Badge +6

As part of the “Learn to Use the Cognite Python SDK” course in Data Integration, while running the “test_Authentication.ipynb” notebook, I am getting the error below on the line “c.login.status()”:

 

--> 703 httplib_response = self._make_request(
    704     conn,
    705     method,
    706     url,
    707     timeout=timeout_obj,
    708     body=body,
    709     headers=headers,
    710     chunked=chunked,
    711 )
    713 # If we're going to release the connection in ``finally:``, then
    714 # the response doesn't need to know about the connection. Otherwise
    715 # it will also try to release it and we'll have a double-release
    716 # mess.

File c:\Users\j.subhash.parandekar\AppData\Local\Programs\Python\Python38\lib\site-packages\urllib3\connectionpool.py:386, in HTTPConnectionPool._make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    385 try:
--> 386     self._validate_conn(conn)
    387 except (SocketTimeout, BaseSSLError) as e:
    388     # Py2 raises this as a BaseSSLError, Py3 raises it as socket timeout.

File c:\Users\j.subhash.parandekar\AppData\Local\Programs\Python\Python38\lib\site-packages\urllib3\connectionpool.py:1042, in HTTPSConnectionPool._validate_conn(self, conn)

...

    150 raise CogniteConnectionRefused from e
--> 151 raise CogniteConnectionError from e
    152 raise e

CogniteConnectionError:
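
The traceback is truncated, so the root cause is not visible here. One way to narrow a connection error down is to check whether the machine can open a plain TCP connection to the API host at all, which helps separate firewall, proxy, or DNS problems from SDK problems. The sketch below uses only the standard library; `can_connect` is a hypothetical helper, and the host name in the usage comment is an assumption built from the cluster URL pattern quoted earlier in this thread. Note that this only tests TCP reachability, not the TLS handshake.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example usage (assumed host name, adjust to your own cluster):
# print(can_connect("westeurope-1.cognitedata.com"))
```

If this returns False on a corporate network but True from a home network, the cause is more likely a proxy or firewall than the notebook code.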
