I believe we need to expose infer_schema_length as a db-extractor parameter to improve its flexibility.
If I understand correctly, the db extractor uses the polars library.
According to the docs:
infer_schema_length: int | None = 100,
which means it infers each column's data type from the first 100 rows by default. This does not fit our client's data.
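For illustration, here is a minimal sketch (the file name is hypothetical) of how that limit can be raised or disabled when reading a CSV with polars directly:

```python
import polars as pl

# Hypothetical file name, standing in for the client CSV.
# infer_schema_length=None makes polars scan the entire file before
# choosing dtypes, at the cost of a slower read; a larger integer
# raises the number of rows sampled instead.
df = pl.read_csv("client_data.csv", infer_schema_length=None)
```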
Currently I am extracting data from CSV and ran into two issues, which I think are related to each other. As I understand it, the extractor infers each column's data type from the first n (probably 100) rows:
- could not parse `"1,616,178"` as dtype `f64` at column '*column_name*' (column number 10). This value is on line 3161.
- could not parse `"2.15"` as dtype `i64` at column '*column_name*' (column number 15). This value is on line 118; all previous values equal "0" (a minimal reproduction of this case is sketched below).
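For context, this reproduces the second case with synthetic data rather than the client file: the column looks like an integer for the first 100 rows, so the later float value fails to parse.

```python
import io

import polars as pl

# Synthetic data (not the client file): 200 zeros followed by "2.15".
# With the default infer_schema_length=100, the column is inferred as
# i64 from the leading zeros, and the float on the last row cannot be
# parsed into that dtype.
csv = "\n".join(["value"] + ["0"] * 200 + ["2.15"])

try:
    pl.read_csv(io.StringIO(csv))
except pl.exceptions.ComputeError as exc:
    print(exc)  # could not parse `"2.15"` as dtype `i64` at column 'value' ...
```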
After the error, the terminal shows this suggestion:
You might want to try:
- increasing `infer_schema_length` (e.g. `infer_schema_length=10000`),
- specifying correct dtype with the `schema_overrides` argument
- setting `ignore_errors` to `True`,
- adding `"1,616,178"` to the `null_values` list.
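As a sketch of the suggested fixes (the file and column names here are hypothetical; the real ones come from the error messages): the thousands-separated column can be read as a string and cleaned afterwards, while the column that mixes "0" and "2.15" can be forced to a float dtype via `schema_overrides`.

```python
import polars as pl

# Hypothetical file and column names; substitute the real ones.
df = pl.read_csv(
    "client_data.csv",
    # Force the mixed integer/float column to Float64 and read the
    # thousands-separated column as plain text for now.
    schema_overrides={"rate": pl.Float64, "amount": pl.Utf8},
    # Sample more rows when inferring the remaining columns.
    infer_schema_length=10000,
)

# Values like "1,616,178" are not valid floats; strip the thousands
# separators and cast afterwards.
df = df.with_columns(
    pl.col("amount").str.replace_all(",", "").cast(pl.Float64)
)
```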