Mysterious. Is it possible that you imported the CogniteClient before numpy was installed? If you restart the notebook are you able to reproduce the error?
Hmm… this seems like an impossible bug (obviously not though 😅). The error originates after a call to `retrieve_dataframe`, which does a test import of both numpy and pandas to make sure both are available. This test passes for some reason 🤔 However, a previous test import of numpy has failed, leading the SDK to think it has been installed without numpy support. I don't see from the code how this is possible. Could you try to print the following:

```python
from cognite.client.data_classes.datapoints import NUMPY_IS_AVAILABLE
print(NUMPY_IS_AVAILABLE)
```
Thanks for reporting this! Could you double-check your virtual environment packages? This looks like a bug that was fixed in 5.7.3. What is the output of running:

```python
from numpy.version import version
print(version)

from cognite.client import __version__
print(__version__)
```
I'd like to add a little to @Johannes Hovda's answer:

1) Start/end

Be careful using start=0 (1970-01-01), as the time series API supports timestamps all the way back to the year 1900. Following the examples in the documentation, you may, for convenience, import the very first possible timestamp to be dead sure not to miss anything. The opposite is true for `end`, which defaults to "now" when not specified; thus, if a time series' first datapoint lies in the future, you need to specify `end` to retrieve it. The API supports timestamps up to 2099-12-31 23:59:59.999.

```python
>>> from cognite.client.utils import MIN_TIMESTAMP_MS, MAX_TIMESTAMP_MS
>>> dps_backup = client.time_series.data.retrieve(
...     id=123,
...     start=MIN_TIMESTAMP_MS,
...     end=MAX_TIMESTAMP_MS + 1)  # end is exclusive
```

2) Efficient queries

You write that you have "thousands of time series" whose initial datapoint you need to find, and this can be very inefficient to query if not done correctly. The time series API allows f
Your `mock_asset` is not a mock. You could use `MagicMock` directly, or better, spec the `Asset` data class:

```python
from unittest.mock import create_autospec
from cognite.client.data_classes import Asset

mock_asset = create_autospec(Asset, spec_set=True)
```
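To see what `spec_set` buys you without needing the SDK installed, here is a runnable sketch using a hypothetical minimal `Asset` stand-in (with the real SDK you would pass the imported `Asset` class instead):

```python
from unittest.mock import create_autospec

# Hypothetical stand-in for cognite.client.data_classes.Asset, just so this
# sketch runs without the SDK; substitute the real class in your tests.
class Asset:
    name = None
    external_id = None

mock_asset = create_autospec(Asset, spec_set=True, instance=True)
mock_asset.name = "pump-01"  # attributes that exist on the spec can be set freely

try:
    mock_asset.nmae = "typo"  # misspelled attribute -> rejected by spec_set
except AttributeError:
    print("spec_set caught the typo")
```

The upside over a plain `MagicMock` is exactly that last line: typos in attribute names fail loudly in your tests instead of silently passing.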
Hey @gsavant! Make sure you've copied the correct link (either HTTPS or SSH, depending on your setup). If this still fails, just download the repository as a zip file; this link is available from the same menu. Best of luck with the course! 😄
You chain a lot of methods here, so the mocking will be a bit entangled as well. First off, you should not call the methods on the mock when setting a return value:

Wrong: `client.assets.retrieve().return_value = asset`
Correct: `client.assets.retrieve.return_value = asset`

Secondly, you need to think about what is returned by the intermediate chained calls: you first get an asset object, then you ask for the time series connected to it. This gives you a `TimeSeriesList` object. On this object, you get the specific time series by using the `.get` method, supplying an identifier, etc. You should probably mock these objects separately, i.e.:

```python
mock_ts_list = ...
mock_asset = ...
mock_asset.time_series.return_value = mock_ts_list
client.assets.retrieve.return_value = mock_asset
```

Hope this has given you a few pointers on mocking! Let me know if you have any follow-up questions.
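Putting the pieces together, here is a self-contained sketch using plain `MagicMock`s (the names `mock_ts`, `"ts-42"`, and the exact chain are illustrative assumptions, not the SDK's real return types):

```python
from unittest.mock import MagicMock

client = MagicMock()

# Hypothetical stand-ins for the objects the real SDK would return.
mock_ts = MagicMock()
mock_ts.external_id = "ts-42"

mock_ts_list = MagicMock()
mock_ts_list.get.return_value = mock_ts            # stands in for TimeSeriesList.get(...)

mock_asset = MagicMock()
mock_asset.time_series.return_value = mock_ts_list  # stands in for Asset.time_series()

client.assets.retrieve.return_value = mock_asset

# Code under test would then resolve the whole chain against the mocks:
ts = client.assets.retrieve(id=123).time_series().get(external_id="ts-42")
print(ts.external_id)  # → "ts-42"
```

Note how each `return_value` is set on the attribute itself, never on the result of calling it, which is exactly the wrong/correct distinction above.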
To upload an actual file, check out: https://cognite-sdk-python.readthedocs-hosted.com/en/latest/cognite.html#upload-a-file-or-directory

Or do you want to read the data and upload, for example, the rows as assets?
Hi Brendan,

A little background first: starting with v5 of the SDK (initially with version >=4), datapoints fetching was changed from JSON to protobuf, as this is by far the heaviest and most used operation. For compatibility with important packages like TensorFlow and Streamlit (to name a few), which unfortunately still depend on an old version of protobuf (^3.x, i.e. <4), we (or rather I) went to great lengths to support both major versions >=3. One of the issues with v3 is that it lacks the compiled C binaries for certain platforms, like M1 Macs, which I'm guessing you are on.

To solve this, verify that your project does not have a strict requirement on v3 of protobuf and simply upgrade to v4 or higher:

```
poetry add "protobuf>=4"
# or
pip install "protobuf>=4"
```

In case you must stay on v3, feel free to silence the warning (it is still plenty fast for small-to-medium sized jobs, i.e. less than a few tens of millions of datapoints):

```python
import warnings
warnings.filterwarnings("ignore")  # narrow with message=... to target just this warning
```
Hi Jason!We are in the middle of updating the course. In the meantime, click on v4 of the Python docs to see the docs that the course was made for.
Today you can place the function's file in a data set and thereby restrict others from altering the code, but the function definition itself cannot be protected in the same way. We are looking into how to improve this, and the experience around controlling access in general, though no clear timeline is in place for when it will be available in the product. Also worth noting: after a function has been created, the file object may be deleted. We have an easy-to-use template for GitHub that helps make deployments to Functions maintainable, and it does exactly this by default: https://github.com/cognitedata/deploy-functions-oidc
> (...) Hi @Håkon V. Treider, it's already present, as per your comments. Attaching a screenshot.

Seems like it is working now, right? 😄
Seems like you get authorized successfully by AAD. That leads me to think you are missing one or both of:

- Projects:LIST with scope: all
- Groups:LIST with scope: all, OR scoped to "current user" (i.e. list your own groups)
> Why does CDF return a DatetimeIndex with dates before my start value?

I can safely say it doesn't. Are you specifying dates using `datetime` objects?

Edit: I see that you are. Naive Python datetimes are always interpreted as local time, as per the language's definition of them (I'd guess 95% of people disagree with this choice, and I am one of them); anyway, we adhere to it. The guide linked above lists a few ways to solve this, but the easiest is probably this (taken from the official Python SDK docs):

```python
from datetime import datetime, timezone

start = datetime(2023, 1, 1, tzinfo=timezone.utc)
```
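A quick runnable illustration of why the `tzinfo` matters (the epoch value below is simply the well-known UTC timestamp of 2023-01-01):

```python
from datetime import datetime, timezone

naive = datetime(2023, 1, 1)                       # interpreted as *local* time
aware = datetime(2023, 1, 1, tzinfo=timezone.utc)  # unambiguous UTC

# The aware timestamp is fixed regardless of where the code runs:
print(aware.timestamp())  # 1672531200.0

# The naive one shifts with the machine's timezone, which is why a naive
# "start" can appear to move your query window by your UTC offset.
print(naive.timestamp())
```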
Time series data in CDF is always returned in UTC. In v2 of the service, the feature you are requesting (data returned in a specific timezone) might become available. In the meantime, please don't convert timestamps manually; there are soooooo many non-intuitive traps to fall into. Since it seems you are already using pandas, let it do the conversion for you:
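As a sketch of what that conversion could look like (the toy DataFrame and the Europe/Oslo zone are assumptions for illustration; the real index would come from `retrieve_dataframe`):

```python
import pandas as pd

# Toy frame standing in for retrieve_dataframe() output; the timestamps
# are UTC, here as a naive index for illustration.
df = pd.DataFrame(
    {"value": [1.0, 2.0]},
    index=pd.to_datetime(["2023-01-01 00:00", "2023-01-01 01:00"]),
)

# Let pandas handle offsets and DST instead of shifting timestamps by hand:
df.index = df.index.tz_localize("UTC").tz_convert("Europe/Oslo")
print(df.index[0])  # 2023-01-01 01:00:00+01:00
```

`tz_convert` also does the right thing across DST transitions, which is exactly the kind of trap manual shifting falls into.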
Hi there! Have you tried to do what the error message suggests? 😊

> Error: The ID token is not yet valid. Make sure your computer's time and time zone are both correct.
Hi Oliver! I would love to clarify the part of the SDK documentation that was unclear; could you post a screenshot, maybe? I am thinking of the part that caused this confusion:

> From the code in the SDK on GitHub, it appears that if the data exceeds the limit, then it would return an empty list, which is a very dangerous return setting.

Thanks! 😊
A bit of a weird stack trace, very short? The only file mentioned is from the yaml lib, which surely can't be the cause? From googling around, it seems completely safe to just silence this particular warning, see e.g. https://github.com/boto/boto3/issues/454#issuecomment-324782994

```python
import warnings

warnings.filterwarnings("ignore", category=ResourceWarning, message="unclosed.*<ssl.SSLSocket.*>")
```
Great, can you post the stack trace please?
Could you run your script like the following (converts warnings to exceptions)?

```
python -Werror my_script.py
```

That way we might get more insight into exactly where the warning originates.
You say you have hourly data, and are missing 2976 - 2928 = 48 datapoints. Are you sure you are not just 2 days off somewhere?
> Thanks, I agree with the computed value. The last thing I wonder about is how/if I can be sure that, no matter how the original datapoints are distributed in time, the average for the hourly aggregate computed at, say, 01:00 will always be based on the weighted averages over the timespan [01:00; 02:00).

Almost; the start and end of the interval can be influenced by outside points. I made an illustration to show it better, modifying the one in the documentation: here you have two points inside the interval (green) and one point before it (red) influencing the part of the average from the left boundary to the first inside datapoint (the same argument applies, mirrored, for the blue datapoint after the interval).
I am not entirely sure I get your question, but I'll try to answer as best I can! The timestamps for aggregates are labelled by the left boundary; that means the period from 03:00 (inclusive) to 04:00 (exclusive) will be labelled 03:00.

In your data above, the aggregate datapoint at 03:00 has the value 0.0 because all datapoints inside the interval, and the two outside points (used for interpolation to the boundaries), are zero.

The aggregate datapoint at 04:00, with the value 0.125, can be computed as follows: from 04:00 to 04:30 the value is 0.0, so this part of the interval does not contribute to the final aggregate value. However, to find the contribution from the timespan from 04:30 to 05:00, we need linear interpolation to find the value at the right boundary, as the documentation specifies. The next value is at 05:30, which gives an in-between value of 0.5 at the right boundary. The average is defined as the time-weighted average (mean distance
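The arithmetic above can be checked with a small sketch of time-weighted averaging (the point values and times are taken from the thread; the helper below is an illustration, not the API's actual implementation):

```python
def time_weighted_average(points, start, end):
    """Time-weighted average of a piecewise-linear signal over [start, end).

    points: sorted (time, value) pairs that bracket the interval, so the
    boundary values can be found by linear interpolation.
    """
    def value_at(t):
        for (t0, v0), (t1, v1) in zip(points, points[1:]):
            if t0 <= t <= t1:
                return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
        raise ValueError("t outside the range covered by points")

    # Clip to the interval and add interpolated boundary points.
    inner = [(t, v) for t, v in points if start < t < end]
    pts = [(start, value_at(start))] + inner + [(end, value_at(end))]
    # Trapezoidal integration, then divide by the interval length.
    area = sum((t1 - t0) * (v0 + v1) / 2 for (t0, v0), (t1, v1) in zip(pts, pts[1:]))
    return area / (end - start)

# Hours on a decimal axis: 3.5 == 03:30 etc. Values assumed from the thread:
# zeros up to 04:30, then the next datapoint 1.0 at 05:30 (so that the
# interpolated value at the 05:00 boundary is 0.5, as described above).
points = [(3.5, 0.0), (4.5, 0.0), (5.5, 1.0)]
print(time_weighted_average(points, 4.0, 5.0))  # → 0.125
```

The flat 0.0 stretch contributes nothing, and the triangle from 04:30 to 05:00 (rising 0.0 → 0.5) contributes 0.5 h × 0.25 avg = 0.125, matching the aggregate.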