Secure data transfer between cplace systems with GraphQL
GraphQL was originally developed by Facebook to simplify the development of APIs for web clients. Instead of providing a specific interface on the server side for each client and use case, with GraphQL it is sufficient to develop a single generic query interface. Detailed tutorials on how to use and implement GraphQL for Web APIs with different programming languages are available at graphql.org.
Use case: data transfer
The following is a slightly more unusual use case, namely the development of a generic component for secure data transfer between cplace systems.
When data has to be transferred from a source server to a target server in the cplace context, the mechanism for selecting and transferring the configured data should itself require no programming. It should work generically and be driven by configuration, in other words, “no code”.
This basic principle and additional customer requests resulted in the following requirements for the development of the data transfer component:
- The interface provided by the source server must dynamically follow the currently configured data schema in each case.
- The source server must be able to control which data it passes on to which target servers. This prevents confidential data from being leaked.
- The target server must be able to define its data requirements by configuration, in the form of a query.
- It should be possible to define the timing and scope of transfers on a configuration-controlled basis. For example, “stable” master data could be synchronized only on weekends, while “fast-moving” transaction data could be synchronized at shorter intervals.
cplace already offers a component for data transfer called “Cross-Company eXchange (CCX)”. However, the collaboration Factory’s shared-source approach also makes it possible to integrate alternative interface technologies such as GraphQL without further ado. This openness distinguishes cplace from proprietary systems and simplifies the connection of other systems - rapid application development is joined by rapid application integration.
GraphQL as a basic technology
GraphQL works similarly to SQL: it provides a language for defining data schemas and for querying and manipulating the corresponding data.
The figure shows an example: The schema on the left of the figure defines, among other things, projects (type “Project”) with two attributes (“name” and “tagline”, each of type “String”) and a relationship (“contributors” to entities of type “User”). Based on this schema, the query “return me the tagline of the project named ‘GraphQL’” can be formulated, the result of which can be seen on the far right. The result is in JSON format; the query is written in GraphQL's own, JSON-like syntax.
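Written out as text, the example described above could look roughly like the following sketch. It only holds the three parts as strings and reads the value out of the result. The type and field names come from the description; the `project(name: ...)` argument form, the `data` wrapper of the response, and the tagline value are illustrative assumptions.

```python
import json

# Schema in GraphQL's schema definition language (SDL).
# The "User" type is part of the full schema and omitted here.
SCHEMA = """
type Project {
  name: String
  tagline: String
  contributors: [User]
}
"""

# Query: "return me the tagline of the project named 'GraphQL'".
QUERY = """
{
  project(name: "GraphQL") {
    tagline
  }
}
"""

# A result as a server could return it; the tagline value is illustrative.
RESULT = '{"data": {"project": {"tagline": "A query language for APIs"}}}'


def tagline_from(result_json: str) -> str:
    """Read the tagline from the JSON result, whose shape mirrors the query."""
    return json.loads(result_json)["data"]["project"]["tagline"]
```

The point of the example is that the result has the same nesting as the query, so a client knows the shape of the response before it sends the request.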
Clients can retrieve the current schema via a special GraphQL query (introspection). It is then available in JSON form, for example. In principle, the schema can be changed at runtime. If the changed schema is compatible with the previous one (for example, if only new data fields are added), existing client queries remain valid.
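A minimal sketch of this mechanism, assuming the schema is fetched with the standard `__schema` introspection field and the query is sent as a JSON body. The helper names are hypothetical, and the compatibility check is deliberately simplified: it only verifies that no type or field has disappeared and ignores field types and arguments.

```python
import json

# Standard GraphQL introspection query listing all types and their fields.
INTROSPECTION_QUERY = "{ __schema { types { name fields { name } } } }"


def introspection_body() -> bytes:
    """JSON body for sending the introspection query in an HTTP POST request."""
    return json.dumps({"query": INTROSPECTION_QUERY}).encode("utf-8")


def fields_by_type(response_json: str) -> dict:
    """Map each type name to the set of its field names.

    Scalars report "fields": null in introspection, hence the "or []".
    """
    types = json.loads(response_json)["data"]["__schema"]["types"]
    return {t["name"]: {f["name"] for f in (t.get("fields") or [])} for t in types}


def is_compatible(old: dict, new: dict) -> bool:
    """True if every type and field of the old schema still exists in the new one."""
    return all(name in new and fields <= new[name] for name, fields in old.items())
```

With this check, a schema that only gains fields counts as compatible, while one that drops a field a client may still query does not.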
The implementation of the component for data transfer between cplace systems with GraphQL follows more or less directly from the requirements:
To provide the current GraphQL schema, the source server needs access to its currently configured data schema. For every type, relation, and attribute of the cplace data schema, a corresponding GraphQL type, relation, and attribute has to be created that allows access at runtime.
In our particular use case, this was quite easy to do, since the data models and base data types of cplace fit well with the corresponding GraphQL concepts.
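To illustrate the idea of deriving the GraphQL schema from the configured data schema, here is a small sketch. The input format (a dictionary of type names to attributes), the base type names, and the mapping table are assumptions made for this example and do not reflect the actual cplace data model or API.

```python
# Hypothetical mapping from configured base types to GraphQL built-in scalars.
SCALAR_MAP = {"string": "String", "number": "Float", "boolean": "Boolean"}


def to_sdl(types: dict) -> str:
    """Render a configured data schema as GraphQL schema definition language.

    `types` maps a type name to a dict of attribute name -> kind. A kind that
    is not a known base type is treated as a reference to another type.
    """
    lines = []
    for type_name, attrs in types.items():
        lines.append(f"type {type_name} {{")
        for attr, kind in attrs.items():
            lines.append(f"  {attr}: {SCALAR_MAP.get(kind, kind)}")
        lines.append("}")
    return "\n".join(lines)


# Example input: two types, one of which references the other.
EXAMPLE = {
    "Project": {"name": "string", "budget": "number", "lead": "User"},
    "User": {"name": "string"},
}
```

Because the schema text is generated from the configuration on each call, a change to the configured types shows up in the GraphQL schema without any code change.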
The remaining problems could be solved by the following techniques:
- For some special cplace data types (e.g. DateTime and certain reference types) there is no equivalent among the built-in GraphQL scalar types. Therefore, so-called “custom scalars” were defined in addition to the built-in types, a mechanism that the GraphQL standard explicitly provides for.
- The type names of the server data schema could not simply be adopted in GraphQL, because GraphQL names, in contrast to cplace names, may contain only ASCII letters, digits, and underscores. Therefore, in the generated GraphQL schema, dots were replaced by underscores, and special characters (including literal underscores, hyphens, umlauts, etc.) were replaced by their HTML codes embedded in underscores (without the leading &). For example, “Abc.DEÄ” in the GraphQL schema and the corresponding query becomes “Abc_DE_auml_”.
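Both techniques can be sketched in a few lines. This is an illustration under assumptions, not the cplace implementation: the DateTime scalar is assumed to be serialized as an ISO 8601 string, entity names are taken from Python's HTML 4 entity table (which distinguishes case, so “ä” yields `auml` and “Ä” yields `Auml`), and characters without a named entity fall back to their decimal code point, which is a choice made only for this example.

```python
import re
from datetime import datetime, timezone
from html.entities import codepoint2name

# Pattern for valid names according to the GraphQL specification.
GRAPHQL_NAME = re.compile(r"^[_A-Za-z][_0-9A-Za-z]*$")


def to_graphql_name(cplace_name: str) -> str:
    """Turn a type name with dots and special characters into a GraphQL name."""
    parts = []
    for ch in cplace_name:
        if ch == ".":
            parts.append("_")
        elif ch.isascii() and ch.isalnum():
            parts.append(ch)
        else:
            # Named HTML entity if one exists, else the decimal code point
            # (the fallback is an assumption of this sketch).
            entity = codepoint2name.get(ord(ch), str(ord(ch)))
            parts.append(f"_{entity}_")
    return "".join(parts)


def serialize_datetime(value: datetime) -> str:
    """Custom scalar output: a DateTime value as an ISO 8601 string."""
    return value.isoformat()


def parse_datetime(text: str) -> datetime:
    """Custom scalar input: an ISO 8601 string back to a DateTime value."""
    return datetime.fromisoformat(text)
```

Escaping literal underscores as well keeps a name such as “Abc_DE” from colliding with the mangled form of “Abc.DE”.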
To let the source server control outgoing data, access control was implemented using access tokens and a check of the requesting IP address. Data and structures that must not be transferred can be hidden from the generated GraphQL schema via configuration.
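A simplified sketch of such a check, assuming one configuration entry per target server. The configuration layout, the client identifier, and the function names are hypothetical; a real deployment would keep tokens in a secret store rather than in code.

```python
import hmac
import ipaddress

# Hypothetical configuration: token, allowed networks, and hidden fields
# for each target server that may query this source server.
CLIENTS = {
    "target-a": {
        "token": "s3cret-token",
        "networks": ["10.0.0.0/24"],
        "hidden": {"Project": {"budget"}},
    }
}


def is_allowed(client_id: str, token: str, ip: str) -> bool:
    """Accept a request only if both the token and the source IP match."""
    client = CLIENTS.get(client_id)
    if client is None:
        return False
    # Constant-time comparison avoids leaking token contents via timing.
    token_ok = hmac.compare_digest(token.encode(), client["token"].encode())
    address = ipaddress.ip_address(ip)
    ip_ok = any(address in ipaddress.ip_network(n) for n in client["networks"])
    return token_ok and ip_ok


def visible_schema(client_id: str, schema: dict) -> dict:
    """Return the schema with all fields hidden for this client removed."""
    hidden = CLIENTS[client_id]["hidden"]
    return {
        type_name: {f: k for f, k in fields.items() if f not in hidden.get(type_name, set())}
        for type_name, fields in schema.items()
    }
```

Since hidden fields never appear in the schema the client sees, a query that asks for them fails schema validation before any data is read.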
The query on the target system is also defined by configuration. For the query to succeed and the retrieved data to be read successfully, the data schema of the target server must match the corresponding section of the data schema of the source server. To test this, there is a schema validator that checks structure and types against the database. Since the data schema can change dynamically at any time, the queries are additionally safeguarded by appropriate checks and GraphQL error messages.
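The kind of check such a validator performs can be sketched as follows. The schema representation (type name to field name to type) and the function names are assumptions for this example; the second helper relies only on the fact that a GraphQL response reports problems in an `errors` list with a `message` per entry.

```python
def validate(query_fields: dict, source_schema: dict, target_schema: dict) -> list:
    """Compare the fields a query selects on the source and target schemas.

    `query_fields` maps a type name to the field names the configured query
    selects. Returns a list of human-readable problems; empty means valid.
    """
    problems = []
    for type_name, fields in query_fields.items():
        for field in fields:
            src = source_schema.get(type_name, {}).get(field)
            tgt = target_schema.get(type_name, {}).get(field)
            if src is None:
                problems.append(f"{type_name}.{field}: missing on source")
            elif tgt is None:
                problems.append(f"{type_name}.{field}: missing on target")
            elif src != tgt:
                problems.append(f"{type_name}.{field}: type {src} on source, {tgt} on target")
    return problems


def graphql_errors(response: dict) -> list:
    """Collect the messages from the "errors" list of a GraphQL response."""
    return [e.get("message", "") for e in response.get("errors", [])]
```

Running the validation before a transfer catches configuration mismatches early, while the error list covers schema changes that happen between validation and execution.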
Finally, a configuration mechanism has been developed to specify transfer times and scopes (via GraphQL queries to be executed).
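How such a configuration might be modeled is sketched below, using the weekend and short-interval examples from the requirements. The `TransferJob` class, its fields, and the example queries are hypothetical and only illustrate the idea of pairing a GraphQL query with a schedule.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class TransferJob:
    name: str
    query: str                                  # GraphQL query defining the scope
    interval: timedelta                         # minimum time between two runs
    weekdays: frozenset = frozenset(range(7))   # Monday=0 ... Sunday=6

    def is_due(self, now: datetime, last_run: Optional[datetime]) -> bool:
        """True if the job may run today and its interval has elapsed."""
        if now.weekday() not in self.weekdays:
            return False
        return last_run is None or now - last_run >= self.interval


# "Stable" master data: at most once a week, on weekends only.
MASTER_DATA = TransferJob(
    "master-data", "{ projects { name } }", timedelta(days=7), frozenset({5, 6})
)

# "Fast-moving" transaction data: every 15 minutes on any day.
TRANSACTIONS = TransferJob(
    "transactions", "{ tasks { status } }", timedelta(minutes=15)
)
```

A scheduler would call `is_due` periodically for each configured job and execute the job's query against the source server when it returns true.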
GraphQL is in principle quick to learn, but it offers a depth that one first has to get an overview of and then apply correctly. With a few exceptions (especially the support of types with dots or special characters in the name), the concepts, mechanisms, and interfaces that GraphQL offers by default were sufficient for our implementation. The Java libraries used in the GraphQL environment are also mature.
Particularly helpful for testing and debugging was the GraphiQL graphical query interface, which can be used to interactively check both the generated data schema and sample queries based on it.
Special measures to optimize performance were not necessary for our use case: with up to several hundred thousand data records, the data could be read from the cplace source system in JSON format, transferred, and persisted in the cplace target system in a transaction-controlled manner without any problems.
Authors: Sebastian Zimmer (Software Developer), Dr. Klaus Bergner (Managing Director), Holger Spiering (Software Architect)