Search for row content in RAW Explorer

Related products: Transformations and RAW

When using RAW Explorer, I find it is not possible to search for content in the table. For example, I cannot search for content in a given column. This means I have to download the table to Excel and do the search there (which works perfectly fine). It is a shame that data exploration seems to be better in Excel in this case!

 

 

Hi Benjamin, 

What is the use case / scenario that requires you to be able to search within RAW please?

Thanks, Glen


So let’s say, for example, that the rows represent information about documents.

I want to:

  • Validate that a specific document is actually in the table
  • Review the information for this document

In this case, the table is used to store information from a solution we have built with f25e. In other cases, it could hold information related to transformations, and so on.


OK, so it’s a data onboarding workflow where the data in each row relates to a document. I guess the rows end up being persisted as metadata?

To validate whether a doc is referenced in the table, you’re typically only interested in one column then?  

Are there other use cases that fall outside of this pattern?

 


In this case the data has nothing to do with metadata. It is just a “state store” table containing information about how the document is being processed by our f25e solution. 

Typically yes, being able to search rows by column would be very helpful. Say, for example, I want to search for document ID abc in the column Doc ID. To do that, I would search in that column and all rows matching the search would appear. (Think of how it works in Excel.)

The use case, I guess, is quite general: a user simply wants to check specific data in RAW. Typically that could be to verify that specific samples of data exist and are correct.
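As a rough illustration of the column search described above, here is a minimal Python sketch. It assumes the table rows have already been exported and are available as dictionaries with a "key" and a "columns" mapping; the function name `search_rows`, the `exact` flag, and the sample data are all hypothetical and not part of RAW Explorer or any CDF API.

```python
from typing import Any, Iterable


def search_rows(
    rows: Iterable[dict[str, Any]],
    column: str,
    query: str,
    exact: bool = False,
) -> list[dict[str, Any]]:
    """Return the rows whose `column` value matches `query`, ignoring case.

    Each row is assumed to look like {"key": ..., "columns": {...}}.
    Rows that lack the column are skipped.
    """
    needle = query.casefold()
    matches = []
    for row in rows:
        columns = row.get("columns", {})
        if column not in columns:
            continue
        value = str(columns[column]).casefold()
        # Exact mode compares whole values; otherwise do a substring match.
        if (value == needle) if exact else (needle in value):
            matches.append(row)
    return matches


# Hypothetical sample rows standing in for an exported state-store table.
sample_rows = [
    {"key": "1", "columns": {"Doc ID": "abc", "status": "processed"}},
    {"key": "2", "columns": {"Doc ID": "abcdef", "status": "pending"}},
    {"key": "3", "columns": {"Doc ID": "xyz", "status": "failed"}},
    {"key": "4", "columns": {"status": "unknown"}},
]

for row in search_rows(sample_rows, "Doc ID", "abc", exact=True):
    print(row["key"], row["columns"])
```

With `exact=False` the same call would also return the row whose Doc ID merely contains "abc", which is closer to how a text filter in Excel behaves.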

 


Ah ok, so you’re using RAW as a persistent store for this data then?


I guess you can say that in this case, yes. But other times it might be used in the ETL process as part of making data available. I don't quite understand how this is relevant for the use case, since validation of data still applies in both cases.


The questions are meant to improve my understanding of the problems you face and the use cases. Using RAW as a persistent store is not a pattern we want to optimise for in the future, but it remains a product need within CDF. Your point about the need for data validation in staging is well taken.


To explain further: if in future we want to make a persistent store that supports the use case you’re describing (so that RAW is not used for that purpose), then it’s important that we understand the usage patterns.


This idea was moved to our long-term backlog. The behavior requested here is not in line with how RAW was designed to work; however, we acknowledge the gap of such functionality in CDF.