Solved

inso-bootstrap-cli is not creating CDF groups

  • 3 February 2023

I have an AAD with a group and a source in it. I am trying to configure inso-bootstrap-cli’s yml file (bootstrap-cli-config.yml) to create the CDF group and map the AAD group to it.

This is my yml file:

---bootstrap:    features:        with-special-groups: false        with-raw-capability: true        aggregated-level-name: all        group-prefix: test        dataset-suffix: ds        rawdb-suffix: db        rawdb-additional-variants:            - state    idp-cdf-mappings:        - cdf-project: myproject          mappings:              - cdf-group: test:all:FTDM_Users                # the objectid below is the correct id of the AAD group                idp-source-id: 5b20d90a-648f-4005-89df-da9bc3d122b3                idp-source-name: SWC_DATA_FTDM_Users    namespaces:        - ns-name: src          description: Data sources for the extractor          ns-nodes:              - node-name: src:001:fteg                description: Source 01 Data from datacognite:    host: ${BOOTSTRAP_CDF_HOST}    project: ${BOOTSTRAP_CDF_PROJECT}    idp-authentication:        client-id: ${BOOTSTRAP_IDP_CLIENT_ID}        secret: ${BOOTSTRAP_IDP_CLIENT_SECRET}        scopes:            - ${BOOTSTRAP_IDP_SCOPES}        token_url: ${BOOTSTRAP_IDP_TOKEN_URL}logger:    file:        path: ./logs/bootstrap-cli-logs.log        level: INFO    console:        level: INFO

When I run the file using this command it succeeds (no errors)
        poetry run bootstrap-cli deploy ~/workspace/myfolder/config/bootstrap/bootstrap-cli-config.yml

However, when I look in CDF, I see a bunch of stuff created (associated with the namespace config and source), but I don’t see the group for the source AAD group.

Is there something wrong with this yml file?

Thanks

Adrian

Best answer by Peter Arwanitis 3 February 2023, 16:47

This is a better view of the yml content from the above posting 

---
bootstrap:
    features:
        with-special-groups: false
        with-raw-capability: true
        aggregated-level-name: all
        group-prefix: test
        dataset-suffix: ds
        rawdb-suffix: db
        rawdb-additional-variants:
            - state

    idp-cdf-mappings:
        - cdf-project: myproject
          mappings:
              - cdf-group: test:all:FTDM_Users
                idp-source-id: 5b20d90a-648f-4005-89df-da9bc3d122b3
                idp-source-name: SWC_DATA_FTDM_Users

    namespaces:
        - ns-name: src
          description: Data sources for the extractor
          ns-nodes:
              - node-name: src:001:fteg
                description: Source 01 Data

cognite:
    host: ${BOOTSTRAP_CDF_HOST}
    project: ${BOOTSTRAP_CDF_PROJECT}
    idp-authentication:
        client-id: ${BOOTSTRAP_IDP_CLIENT_ID}
        secret: ${BOOTSTRAP_IDP_CLIENT_SECRET}
        scopes:
            - ${BOOTSTRAP_IDP_SCOPES}
        token_url: ${BOOTSTRAP_IDP_TOKEN_URL}

logger:
    file:
        path: ./logs/bootstrap-cli-logs.log
        level: INFO
    console:
        level: INFO

 

 

 


Hi Adrian,

When debugging myself, I always found it helpful to use the Diagram-option of the cli to check for any errors and see if things are connected as I would expect. 

How to use it is explained here:
https://github.com/cognitedata/inso-bootstrap-cli#diagram-command

 

Regarding your case, it seems that your idp-cdf-mappings refers to a cdf-group that is not created by the tool, and is thus not touched. Using the diagram tool explained above, you will get a list of the groups that will be created, and you can map one of these to your idp-group.

 

Hope this helps

Sverre 


Hi Adrian,

Just confirming Sverre's reply.

The diagram really helps until you get used to the naming scheme of the created cdf-groups.

In your case some -- but not all -- group-names generated by your config are:

test:all:owner
test:all:read
test:src:001:fteg:owner
test:src:001:fteg:read

Thanks. I changed the groupnames to all:owner and all:read and it works like a charm

I still find it “disappointing” that you’re limited to those names but they’re usable so that’ll work


Hi Adrian,

> Thanks. I changed the groupnames to all:owner and all:read and it works like a charm

Glad to hear :)

 

> I still find it “disappointing” that you’re limited to those names but they’re usable so that’ll work

Could you please explain what is the “disappointing” part?

What would be your ideas for an ideal naming-scheme?

 

---

Some of the constraints which lead to this naming-scheme and example-configuration:

  1. supporting an alpha-num sortable list in CDF Group, RAW DB or Data Set listings (the only reason that three-digit numbers are suggested on node-level, which are not mandatory)
  2. allowing an easy filtering in Fusion UI by a known namespace or node-name
  3. reflect the hierarchical namespace in the name / externalId
  4. keep the naming scheme consistent across CDF Group, RAW Database names and Data Sets
  5. we have to keep names short, as for example RAW DB names are limited to 32 characters (a validation step helps and alerts you if names are getting too long)
  6. where “descriptions” are possible you can add longer text

Besides these constraints, you can “express” your project-related semantics/terminology by choosing your namespace / node names.

The more feedback we get from users, the better we can plan our improvements!

regards
Peter
(=PA=)

P.S.: I’ve seen that you configured this part:
group-prefix: test

That’s OK if it is a “test” :) But it will block you from deploying the *same* configuration across multiple projects like dev/test/prod, which typically requires only `idp-cdf-mappings` changes. This prefix was only meant to make it obvious in the CDF Group listing which groups are generated by bootstrap-cli, and to separate them from manually configured ones.
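
As a rough sketch of that idea, one configuration could carry a mapping block per project, so that only the `idp-cdf-mappings` entries differ between environments. The project names `myproject-dev` and `myproject-prod`, the neutral prefix `cdf`, the AAD group names and the all-zero object ids are placeholders, not values from this thread:

```yaml
bootstrap:
    features:
        # assumption: a project-neutral prefix instead of "test"
        group-prefix: cdf
    idp-cdf-mappings:
        # one block per CDF project (hypothetical project names)
        - cdf-project: myproject-dev
          mappings:
              - cdf-group: cdf:all:owner
                # placeholder, replace with the real AAD object id
                idp-source-id: 00000000-0000-0000-0000-000000000001
                idp-source-name: DEV_Admins
        - cdf-project: myproject-prod
          mappings:
              - cdf-group: cdf:all:owner
                # placeholder, replace with the real AAD object id
                idp-source-id: 00000000-0000-0000-0000-000000000002
                idp-source-name: PROD_Admins
```

This assumes the tool applies the block whose `cdf-project` matches the project being deployed to; it is worth confirming that against the README before relying on it.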

Hi Peter
 

Thanks for that further explanation. Based on the constraints, I appreciate why you limit the groups to “owner” and “read”

Before I respond to your request for what I meant by “disappointing”, let me describe my setup that I was trying to map over to CDF.

My AAD Groups might have names like SuperAdmins, Contributors, Developers. I was then trying to map those to CDF groups with the same names (which would then be given appropriate capabilities to line up with their intended scope). This is when I hit the snag. I can’t map to groups ending in anything other than “owner” or “read”.

What I suppose I am saying I would like to be able to do (notwithstanding the constraints you describe) is to be able to define groupnames in a custom fashion and not be limited to “owner” and “read”.

I understand that you can use Namespace and Nodes to create custom nodes in the namespace but (please correct me if I am mistaken in the following statement) these would not automatically tie up to the associated AAD groups via SourceId. You would have to manually configure the SourceId to map the group to an associated AAD group.

 

Regarding my use of “test”. Yes, it is totally because I was running a basic test that I chose that name. :)
group-prefix: test

 

Thanks
Adrian


Hi Adrian, and thank you for taking the time to explain your case, which I have now (hopefully) understood.

Looking forward to your feedback, as I got a bit carried away with explaining and evangelizing the approach :)

 


> let me describe my setup that I was trying to map over to CDF

Migrating an existing setup to a CDF “bootstrap-cli” one is certainly not an easy task.

Why? Because “bootstrap-cli” comes with an opinionated naming-scheme -- as you are now aware.

Migrating now requires either seeking a compromise or following a new approach.

The compromise (or alignment) can happen in my opinion on:

  1. naming your ns/nodes level
  2. freedom of mapping in `idp-cdf-mappings` (where in an ideal world the AAD Group names follow a naming pattern similar to the one on the CDF side)
  3. creating specific namespace/nodes only reflecting end user-roles, using the `shared-access` feature to define shared dataset access-control by using the READ/OWNER semantic.
  4. (I only refer to CDF Dataset in my explanations, but it is Data Sets and RAW Databases which are available and used for scoping, with same naming-scheme but configurable suffixes)
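
A hedged sketch of point 3: the `role` namespace and its node name below are made up for illustration, and the `shared-access` syntax mirrors the example config in this reply. The idea is that a node modelling an end-user role is granted owner access to a data node:

```yaml
bootstrap:
    namespaces:
        - ns-name: src
          description: Data sources for the extractor
          ns-nodes:
              - node-name: src:001:fteg
                description: Source 01 Data
        # hypothetical namespace that only models end-user roles
        - ns-name: role
          description: End-user roles
          ns-nodes:
              - node-name: role:001:contributors
                description: Contributors role
                # share the datasets of another node with this role
                shared-access:
                    owner:
                        - node-name: src:001:fteg
```

Following the naming pattern seen earlier in this thread, the generated groups for that node would presumably be `test:role:001:contributors:owner` and `test:role:001:contributors:read`; the diagram command is the way to confirm the actual names.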

The READ/OWNER suffix -- which hit you -- is another core-concept and an “opinionated template”.

Documented here: https://github.com/cognitedata/inso-bootstrap-cli/tree/v2.4.0#templating

It simplifies the complexity of “CDF Group > capabilities > acl > actions” => into two sets:

  • one for READ (actions: read, list, ..) and
  • another for OWNER access (with additional actions:delete, update, ..)
  • without capabilities to escape or break the applied dataset-scoping (like creating new groups or datasets!)
  • defined and applied for all available CDF Resource types (raw, files, assets, timeseries, events, ..)
  • but strictly scoped to the datasets accessible for the CDF Group (ns/nodes)

Without these predefined sets, you can easily drown in the complexity of combining all CDF access-control features.

Bootstrap-cli favors an approach which can be understood and explained to non-cdf-technical people too (stakeholders, IT Security), even after years of extending and scaling.


The Diagram option helps to visualize the symmetry between READ and OWNER CDF Groups, as well as the connection to AAD Groups. The “Shared-Access” feature is by design only available for OWNER-type CDF Groups.

(Have you already noticed the automatically available “super groups” like src:all or the top-level all?)

This diagram was created with a minimally extended version of the config you shared, and demos the typical project quick-start using the `all` groups to connect to:

    idp-cdf-mappings:
        - cdf-project: myproject
          mappings:
              - cdf-group: test:all:owner
                idp-source-id: 5b20d90a-648f-4005-89df-1111111
                idp-source-name: SWC_DATA_FTDM_Admins
              - cdf-group: test:all:read
                idp-source-id: 5b20d90a-648f-4005-89df-2222222
                idp-source-name: SWC_DATA_FTDM_Users
    namespaces:
        - ns-name: src
          description: Data sources for the extractor
          ns-nodes:
              - node-name: src:001:fteg
                description: Source 01 Data
              - node-name: src:002:second
                description: Source 02 Data
                shared-access:
                    owner:
                        - node-name: src:001:fteg

> You would have to manually configure the SourceId to map the group to an associated AAD group.

Yes, keeping AAD and CDF groups in sync is a manual job.

CDF and AAD are seen here as two independent systems with a loose coupling via `idp-cdf-mappings`, as it is not feasible to dictate or automate the customer-side IdP (AAD) from the bootstrap-cli end.


> What I suppose I am saying I would like to be able to do (notwithstanding the constraints you describe) is to be able to define groupnames in a custom fashion and not be limited to “owner” and “read”.

To answer this explicitly: only on the AAD side can you name groups in a custom way. On the CDF side you can control some parts of the naming, but `:owner` and `:read` are hardcoded and automatically provided for *every* configured node-name.
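
For the AAD groups mentioned in this thread, that could look like the following sketch. Which AAD group maps to which generated CDF group is an assumption made for illustration, and the all-zero object ids are placeholders for the real AAD object ids:

```yaml
bootstrap:
    idp-cdf-mappings:
        - cdf-project: myproject
          mappings:
              # generated CDF group name, custom AAD group name
              - cdf-group: test:all:owner
                idp-source-id: 00000000-0000-0000-0000-000000000001
                idp-source-name: SuperAdmins
              - cdf-group: test:src:001:fteg:owner
                idp-source-id: 00000000-0000-0000-0000-000000000002
                idp-source-name: Contributors
              - cdf-group: test:all:read
                idp-source-id: 00000000-0000-0000-0000-000000000003
                idp-source-name: Developers
```

The `cdf-group` values are taken from the generated names listed earlier in the thread; only the `idp-source-name` side is free-form.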
