Hi
For a dashboard use case we are working on, I want to extract a list of the column names in each raw table in our staging area. At the moment, there does not seem to be a way of doing this. I have written two very hacky ways of accessing this information (see the code example below), but they are either very time consuming, because the raw schema has to be inferred, or they return nothing, because the table has too many columns and the request times out. This makes the approach unfeasible when running the scripts for our whole environment, which would need to happen regularly. I feel like there has to be a better way of doing this. I know raw is a schemaless service, but the columns do exist. Having this information would greatly improve our efforts to get a better overview of our data.
from typing import Any, Dict, List

from pydantic import BaseModel


class RawTable(BaseModel):
    database: str
    table: str

    def to_friendly_name(self) -> str:
        return f"{self.database}.{self.table}"

    def get_inferred_raw_schema(self, cognite_client) -> Dict[str, Any]:
        # Infer the schema by previewing a query over the first 100 rows.
        schema = cognite_client.transformations.preview(
            query=f"select * from `{self.database}`.`{self.table}` limit 100"
        )
        return schema.schema.dump()

    def get_raw_schema_from_profiler(self, cognite_client) -> List[str]:
        # Call the raw profiler endpoint directly and read the column names
        # from the keys of the "columns" object in the response.
        res = cognite_client.post(
            url="/api/v1/projects/[INSERT_PROJECT]/profiler/raw",
            json={"database": self.database, "table": self.table, "limit": 1000}
        )
        return list(res.json()["columns"].keys())
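To make the goal concrete, here is a minimal sketch of the result I am after: the union of the column names seen across a sample of rows. The helper name `collect_column_names` is hypothetical, and it assumes each row is already available as a plain dict of column name to value (for example, the column data of rows sampled from a raw table); how those rows are fetched is not covered here.

```python
from typing import Any, Iterable, List, Mapping


def collect_column_names(rows: Iterable[Mapping[str, Any]]) -> List[str]:
    """Return the distinct column names found in `rows`, in first-seen order.

    Hypothetical helper: `rows` is assumed to be an iterable of plain dicts,
    one per raw row, mapping column name to value.
    """
    seen = {}  # a dict keeps insertion order, so it doubles as an ordered set
    for row in rows:
        for name in row:
            seen.setdefault(name, None)
    return list(seen)


# Usage with made-up sample rows: rows need not share the same columns.
sample_rows = [
    {"id": 1, "name": "pump-a"},
    {"id": 2, "pressure": 3.5},
]
columns = collect_column_names(sample_rows)
```

Because raw is schemaless, a sample can miss columns that only appear in rows outside it, so this would only approximate the full column list.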
Thank you!
Sebastian