Question

Using Pygen to query through relations

  • 3 May 2024

I’m working on a prototype for a flexible data model to store time series data in a way that is easy to catalogue, query and filter. Using Pygen both to populate and use the model seems convenient.

In its current iteration, I’ve only applied direct relations and (undocumented?) @reverseDirectRelations in the GraphQL schema. I expected to be able to do something similar to

client.windmill(windfarm="Hornsea 1").blades(limit=-1).sensor_positions(limit=-1).query()

as found in the Pygen documentation, but it does not work (my client.windmill analogue has no methods corresponding to its relations). Do I have to use edges instead of relations to query easily and declaratively with Pygen?

Yes, that is correct. As you see from the documentation, Pygen supports both Python-based querying, as in your example above, and GraphQL. The Python-based querying has some limitations, and this is one of them.

 

I am interested, can you share a bit more about what you are trying to achieve? What type of syntax would you expect to work for the query you are writing?


A subset of the current model sketch would look something like this: 

"""
A production facility, containing many wells
"""
type Hub {
"""
The name of the asset, used as ID.
"""
name: String!

"""
All wells that belong to this hub.
"""
wells: [Well] @reverseDirectRelation(throughProperty: "hub")
}

"""
Oil wells
"""
type Well {
"""
Name/ID of the well
"""
name: String!

"""
The hub of which the well is part
"""
hub: Hub

"""
All profiles from this well
"""
profiles: [Profile] @reverseDirectRelation(throughProperty: "well")
}

"""
A production profile, i.e. time series data
"""
type Profile {
"""
The well whose production is described
"""
well: Well

"""
The phase described, e.g. oil/gas/water
"""
phase: String!

"""
Actual time series data
"""
time_series: TimeSeries
}

 

Based on that, which at its current iteration is pretty easily populated, one would like to fetch all profiles based on properties of its relations, i.e. the well and hub. 

I would like to be able to do something like

client.profile(phase="Gas").well(limit=-1).hub(name="Hub1").query()

to get all gas profiles in the hub Hub1. 

 

Does that make sense? If I were to use edges instead, how much would I have to change? The actual model has around 15 nodes at the moment and some basic inheritance to describe different types of profiles. Will using edges affect the population and the ‘free’ reverse direct relations? 


So with the query `client.profile(phase="Gas").well(limit=-1).hub(name="Hub1").query()` you expect to get the profiles of the hub Hub1? That is the reverse of how it works today. That query, if you had edges, would give you all profiles of phase gas, then all wells connected to these, and then all the hubs connected to those wells.
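To make the difference concrete, here is a minimal sketch in plain Python, not the Pygen API. It mimics the forward traversal described in the reply above on a few made-up records (the record layout and the `traverse` helper are assumptions for illustration only). Each step is filtered on its own, so the hub filter never narrows the list of profiles.

```python
# Toy records: each profile points at a well, each well points at a hub.
wells = {"A": {"hub": "Hub1"}, "B": {"hub": "Hub2"}}
profiles = [
    {"id": "p1", "phase": "Gas", "well": "A"},
    {"id": "p2", "phase": "Oil", "well": "A"},
    {"id": "p3", "phase": "Gas", "well": "B"},
]


def traverse(phase, hub_name):
    """Walk profile -> well -> hub, filtering each step on its own."""
    step_profiles = [p for p in profiles if p["phase"] == phase]
    step_wells = sorted({p["well"] for p in step_profiles})
    step_hubs = sorted(
        {wells[w]["hub"] for w in step_wells if wells[w]["hub"] == hub_name}
    )
    return step_profiles, step_wells, step_hubs


gas_profiles, gas_wells, gas_hubs = traverse("Gas", "Hub1")
```

Here `gas_profiles` still contains the gas profile of well B, which belongs to Hub2. Only the last step is restricted to Hub1, which is why the traversal does not answer "all gas profiles in Hub1".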

I have been looking for an example like this, so I very much appreciate you sharing this. Will look into how it can be implemented. 

To your last question, yes, if you switch to edges that would affect the population and the ‘free’ reverse direct relations. I think it is better to add support for querying along direct relations in pygen. I have registered both as feature requests: support for querying along direct relations, and querying in reverse.


Interesting. With Pygen’s current version, what would be the best way to filter across relations or potentially relations?

I hope there are or will be better solutions than something like this:

hubs = client.hub.list(name="Hub1")
wells = client.well.list(hub=hubs)
profiles = client.profile.list(phase="Gas", well=wells)
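The stepwise lookup above can be sketched without a CDF connection. The following is a plain-Python stand-in, not the Pygen API: the dataclasses and the `profiles_in_hub` helper are hypothetical and only mirror the Hub, Well and Profile types from the model sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hub:
    name: str


@dataclass(frozen=True)
class Well:
    name: str
    hub: Hub  # direct relation Well -> Hub


@dataclass(frozen=True)
class Profile:
    well: Well  # direct relation Profile -> Well
    phase: str


def profiles_in_hub(profiles, wells, hubs, hub_name, phase):
    """Filter step by step: hubs by name, wells by hub, profiles by well and phase."""
    matching_hubs = {h for h in hubs if h.name == hub_name}
    matching_wells = {w for w in wells if w.hub in matching_hubs}
    return [p for p in profiles if p.phase == phase and p.well in matching_wells]


hub1, hub2 = Hub("Hub1"), Hub("Hub2")
well_a, well_b = Well("A", hub1), Well("B", hub2)
all_profiles = [
    Profile(well_a, "Gas"),
    Profile(well_a, "Oil"),
    Profile(well_b, "Gas"),
]

gas_in_hub1 = profiles_in_hub(
    all_profiles, [well_a, well_b], [hub1, hub2], "Hub1", "Gas"
)
```

Against a real project each step would be a separate request, so the cost grows with the number of hops between the type being filtered on and the type being returned.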

Can you provide any timeline on querying along relations being implemented?


Yes, it would be something like that. 

Currently, the highest priority for pygen is to make it more robust against bugs rather than to add new features. So it is hard to give any concrete estimate of how long it is going to take.


Fair enough. What do you think of going for a more denormalised model vs. relying on GraphQL queries?


Hard to say without full context, but I would maybe use GraphQL queries. 
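For reference, a GraphQL query for the Hub/Well/Profile sketch might look roughly like the string below. This is an untested sketch: the `listHub` field, the `items` wrappers and the `filter: {... {eq: ...}}` shape are assumptions about the query API generated from the schema and should be checked against the actual data model. The `build_request` helper is hypothetical, and how the payload is sent is left out.

```python
import json

# Assumed query shape: start at the hub and follow the reverse direct
# relations (wells, profiles) down to the gas profiles.
HUB_PROFILES_QUERY = """
query HubProfiles($hub: String!, $phase: String!) {
  listHub(filter: {name: {eq: $hub}}) {
    items {
      name
      wells {
        items {
          name
          profiles(filter: {phase: {eq: $phase}}) {
            items {
              phase
            }
          }
        }
      }
    }
  }
}
"""


def build_request(hub, phase):
    """Bundle the query text with its variables as a standard GraphQL request body."""
    return {"query": HUB_PROFILES_QUERY, "variables": {"hub": hub, "phase": phase}}


payload = json.dumps(build_request("Hub1", "Gas"))
```

Passing the hub name and phase as variables keeps the query text fixed, so the same string can be reused for other hubs and phases.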


We are using FDM and our code is exclusively Python, so I’m waiting with bated breath for a more elegant way of getting related objects linked across types through pygen.

 

I have used Cypher before, the query language for Neo4j and for property graphs. Its queries are fast and easy to understand. Cypher queries are short and powerful and don’t oblige the user to name the types of all connecting nodes.

 

// Query to find all motors linked to a specific line
MATCH (line:Line {number: 'lineNumber'})-[*]->()-[:HAS_MOTOR]->(motor:Motor)
RETURN DISTINCT motor

// Query to find only conveyor-linked motors for a specific line
MATCH (line:Line {number: 'lineNumber'})-[*]->()-[:HAS_CONVEYOR]->()-[:HAS_MOTOR]->(motor:Motor)
RETURN motor

 

We are modelling industrial processing lines. Our line may have ovens, which may have a conveyor, which may have a motor. In the same way, turbines in the ouras of our oven may also have motors. I would like to be able to list ALL the motors linked to my line, as well as only the conveyor-linked motors, for a particular line or for a particular oven.

 

Here is how I imagine the pygen linked functionality to work.

 

All motors for a particular line or oven
client.motors(line='XXXX').list()

or

client.motors(oven='XXXX').list()

 

Only the ones linked to a conveyor (directly or indirectly)

client.conveyor.motor(line='XXXX').list()

 


Thanks for that input, APSHANKAR Sagar. I will use the feedback in the implementation of the updated query functionality.

