Skip to content

Semantic Data Interconnect (SDI) Data Query Service

Idea

The Semantic Data Interconnect (SDI) Data Query Service can be used to create, update, retrieve, delete and execute queries (JSON based and native SQL).

Access

For accessing this service, you need to have the respective roles listed in SDI roles and scopes.

SDI Data Query Service APIs can be accessed in either of the following ways:

  1. Using REST client: Users of SDI admin role can create, update, delete and retrieve queries from SDI Data Query APIs. In future, users would also get access to in-built query editors.
  2. Using Apps: Analytical applications built on SDI can use queried data using query results. The query results API can be integrated with analytical applications.

Application users can access the REST APIs using REST Client. Depending on the APIs, users will need different roles to access SDI Data Query Service. SDI Data Query Service APIs can be accessed by SDI admin role or technical user roles. The only exception is Query Results API can be accessed by SDI admin, technical user and SDI user roles.

Note

Access to SDI Data Query Service APIs are protected by MindSphere authentication methods, using OAUTH credentials.

Basics

You can create and execute business queries as well as physical queries using SDI Data Query Service. Business queries are validated based on selected semantic model and physical queries are validated based on physical schemas of data within a tenant. Users need to create queries by using custom JSON query format or native SQL query format. For more information about how to formulate queries, refer How to create queries.

SDI provides state-of-art native SQL query engine to query data across disparate sources. Users can now leverage in-built query functionality to gain insights into semantically correlated data. Queries can be either Physical-schema based or semantic based queries. They can be static queries created by data analysts & consumed by end-users or dynamic queries can be created which help users to create multiple execution requests with varied range of input parameters on a single query.

This helps users to dynamically provide range of filter values on which correlated queried data results can be consumed. Further, using standard Native SQL query functions different aggregate functions can be used to create different KPIs analytics and will support variety of use-cases.

Features

SDI supports custom Native SQL queries & JSON based queries. Native SQL queries format support both static & dynamic type of queries. Dynamic queries are parameterized queries where dynamic filter inputs can be provided and aliases can be provided to query result attributes. Operators like Where, inner join, select, order by, from, etc. are supported. Query service supports both semantic queries and physical schema-based queries. If you have created semantic queries based on incomplete semantic model, SDI allows the users to save queries but to get query results; user need to update all the details in the semantic models. Physical schema-based queries are validated against physical schemas. Only validated queries can be saved and are validated against the provided schemas.

Custom JSON query format supports only static queries.

Native SQL based queries supports both static and dynamic (Parameterized queries and alias for query result column names) queries.

In a SQL query, the column values can be defined either in where clause or having clause. If the user provides values for column attributes during query creation, then that query is considered as static query. Users cannot create execution Job requests for the static query, since the system creates it after the query creation. If the user updates the query after validation, the system re-executes the query.

In case of query is categorized as dynamic query (isDynamic=TRUE), if a user wants to provide different values for parameters at the time of execution or needs to get different alias (display name) for columns in the select clause, then the user can set the isDynamic flag to true and provide values for parameters and aliases at the time of execution of job creation. The user can create multiple execution jobs.

If dynamic query gets updated, then all the execution jobs will be marked as obsolete. The user will get outdated results against obsolete execution jobs. The user then needs to create a new execution job to get new results.

If data ingest happens on the affected schemas of corresponding queries, the system will validate all the affected queries due to schema change.
If validation fails, then

  • In case of static query, the user will not receive the result of execution. In case of valid query, the user can see outdated results for few seconds and then new results.
  • In case of dynamic query, all the execution jobs will be marked as obsolete and the user gets the outdated results against those ids.

If query validation is successful, then

  • All the queries will be re-executed and the user will get the current results post execution. For all the queries wherein the execution is not complete, the user will get outdated results.

SDI validates every query created by the user. If the query is not executable, then SDI provides "executable" flag to False. This helps the users to understand that the query cannot be executed. To modify the query, the user can analyze the "pendingActions" provided for users to modify the query. Once the "executable" flag is TRUE, then the user can get the results with status of execution. Currently, SDI provides "current", "outdated" and "obsolete" for query results provided.

SDI currently supports UTC format dates (for example, 2020-02-15T04:46:13Z) and W3C format dates (for example, 2020-10-15T04:46:13+00:00) in the queries.

Note

Native SQL should be encoded using base64 encoder before adding to request

Limitations

  • The timeout period for Query Results and ExecutionJob Results API is 60 seconds.
  • The maximum concurrent request for SDI is restricted to 100 for each tenant.
  • The maximum number of queries that can be stored in SDI is 100 for each tenant.
  • There will be soft-limit and hard-limit to number of executionJobs for every tenant. For more details, contact MindSphere.
  • There will be soft-limit and hard-limit to number of select columns in the query. For more details, contact MindSphere.
  • The maximum payload for query results response can be 250 MB or 1 million row of records. The applicable limit is whichever hits first.
  • In where clause, if column value is a "string", then the user needs to pass the value within single quotes '{value}'.
  • The user can only create select queries.

Native SQL based query Limitations

  • In the native SQL query, if schema name or property name contains any special character, then enclose it with backtick(`).
  • The current version does not support following SQL operations: top, select * , ALL, ANY, COUNT( * )
  • FULL OUTER JOIN is not supported, instead user can use FULL JOIN as FULL OUTER JOIN and FULL JOIN are the same.
  • In queries having BETWEEN function, user should not use 'BETWEEN #{date}# AND #{date}#' format, instead use BETWEEN '{date}' AND '{date}' format.
  • In queries for string concatenation use CONCAT function.

JSON based query Limitations

  • If schema names have a special character “.”, then the user need to use "\"{schemaname}\"" to mention schema name or columns. This needs to be followed for the entire query. Partial use of "\"{schemaname}\"" for schema names or columns results in an error.
  • JSON format queries currently does not support the following SQL operations:
    • IN , NOT IN, BETWEEN, LIMIT, LIKE, MIN, MAX, SUM, AVG
    • COUNT, Distinct
    • NOT
    • IS NULL
    • Alias
    • Left join , right join , full join , outer join, exists, self-join
    • GROUP BY , HAVING
    • union , subquery, all, intersect
    • Insert into , select top

Note

Use of Native SQL query will overcome JSON based query limitations.

For more information, refer Release notes.

Example Scenario

Application developers wants to integrate query results API into dashboard application to receive queried data. Data is ingested into SDI from Enterprise Resource Planning (ERP) and MindSphere IoT data after which schemas are generated automatically by SDI. Data analysts can formulate queries on basis of semantic model or different schemas from data catalog. Each query is uniquely identified by queryid. Application developers can use queryid and integrate query results API into dashboard application to receive queried data.

Further, dynamic query functionality help users to create multiple execution requests on a single query. For eg: If user wants to create a dashboard that exhibits information about plant performance from different locations. Then app developers can create queries to provide same data attributes for different plants and value of location can be sent dynamically over a different execution requests.

This helps users to dynamically provide range of filter values on which correlated queried data results can be consumed. Finally, standard Native SQL query engine provide ability to create different aggregate functions in the query that can be used to create different KPIs analytics and will support variety of use-cases to gain insights from data from disparate sources.

Any questions left?

Ask the community


Except where otherwise noted, content on this site is licensed under the MindSphere Development License Agreement.