Create Graph Structures from deeply nested JSON Documents

Recently, we ran some tests with the current Structr version to see how well it handles deeply nested JSON documents when creating graph structures in Neo4j. It worked quite well up to the second level of object nesting, but only if new objects referenced existing objects either by their UUID or by a scalar value for which a corresponding mapping was defined in the schema.

When attributes of nested objects were referenced by more complex values, such as collections or nested objects themselves, things became difficult: you had to define a rather complex object mapping in the schema, using things like Notion attributions (which use PropertySetNotion internally).

What's new?

Over the past few days, Christian added some improvements to mitigate these issues, making it much easier to create nodes and relationships in Neo4j based on the schema rules in Structr.

Example

To demonstrate the new capabilities, we've created the following example.

Schema

The schema rules in Cypher notation:

(:Project)-[:TASK]->(:Task)
(:Task)-[:SUBTASK]->(:Task)
(:Task)<-[:WORKS_ON]-(:Worker)
(:Worker)-[:WORKS_AT]->(:Company)

The cardinalities are:

  • Project 1 -> * Task
  • Task 1 -> * Task
  • Worker 1 -> * Task
  • Worker * -> 1 Company

To make the example work, it is important to override the auto-naming and name the attributes exactly as in the example: tasks, parentTask, subtasks, company, worker, workers, etc.

Make sure the name attribute is unique for each type.

Auto-creation Rules

Instead of using a PropertyNotion with the autocreate flag, you only have to define the following auto-creation rules (ALWAYS) in the Schema Editor:

  • Project -> Task
  • Task -> Task
  • Worker -> Task
  • Worker -> Company

JSON Document

The following example JSON document contains all information about the objects to be created. Note that some sub-objects occur multiple times in the document. If a unique attribute is defined for their type (such as the name attribute for Project, Company and Worker), the object is created only once.

{
   "name": "Project1",
   "tasks": [
       {
           "name": "Task1",
           "worker": {
               "name": "Worker1",
               "company": { 
                   "name": "Company1"
               }
           },
           "subtasks": [
               {
                   "name": "Subtask1.1",
                   "worker": {
                       "name": "Worker1",
                       "company": { 
                           "name": "Company1"
                       }
                   }
               },
               {
                   "name": "Subtask1.2",
                   "worker": {
                       "name": "Worker2",
                       "company": { 
                           "name": "Company1"
                       }
                   }
               },
               {
                   "name": "Subtask1.3",
                   "worker": {
                       "name": "Worker2",
                       "company": { 
                           "name": "Company1"
                       }
                   }
               },
               {
                   "name": "Subtask1.4",
                   "worker": {
                       "name": "Worker3",
                       "company": { 
                           "name": "Company2"
                       }
                   }
               }
           ]
       },
       {
           "name": "Task2",
           "worker": {
               "name": "Worker2",
               "company": { 
                   "name": "Company1"
               }
           }
       },
       {
           "name": "Task3",
           "worker": {
               "name": "Worker3",
               "company": { 
                   "name": "Company2"
               }
           }
       },
       {
           "name": "Task4",
           "worker": {
               "name": "Worker4",
               "company": { 
                   "name": "Company3"
               }
           },
           "subtasks": [
               {
                   "name": "Subtask4.1",
                   "worker": {
                       "name": "Worker4",
                       "company": { 
                           "name": "Company3"
                       }
                   }
               },
               {
                   "name": "Subtask4.2",
                   "worker": {
                       "name": "Worker4",
                       "company": { 
                           "name": "Company3"
                       }
                   }
               },
               {
                   "name": "Subtask4.3",
                   "worker": {
                       "name": "Worker4",
                       "company": { 
                           "name": "Company3"
                       }
                   }
               },
               {
                   "name": "Subtask4.4",
                   "worker": {
                       "name": "Worker5",
                       "company": { 
                           "name": "Company3"
                       }
                   }
               }
           ]
       },
       {
           "name": "Task5",
           "worker": {
               "name": "Worker5",
               "company": { 
                   "name": "Company3"
               }
           },
           "subtasks": [
               {
                   "name": "Subtask5.1",
                   "worker": {
                       "name": "Worker4",
                       "company": { 
                           "name": "Company3"
                       }
                   },
                   "subtasks": [
                       {
                           "name": "Subtask5.1.1",
                           "worker": {
                               "name": "Worker4",
                               "company": { 
                                   "name": "Company3"
                               }
                           }
                       },
                       {
                           "name": "Subtask5.1.2",
                           "worker": {
                               "name": "Worker4",
                               "company": { 
                                   "name": "Company3"
                               }
                           }
                       }
                   ]
               },
               {
                   "name": "Subtask5.2",
                   "worker": {
                       "name": "Worker4",
                       "company": { 
                           "name": "Company3"
                       }
                   },
                   "subtasks": [
                       {
                           "name": "Subtask5.2.1",
                           "worker": {
                               "name": "Worker4",
                               "company": { 
                                   "name": "Company3"
                               }
                           }
                       },
                       {
                           "name": "Subtask5.2.2",
                           "worker": {
                               "name": "Worker4",
                               "company": { 
                                   "name": "Company3"
                               }
                           }
                       }
                   ]
               }
           ]
       }
   ]
}

Just save this document to a file and POST it to the /projects REST endpoint:

$ curl -i -HX-User:admin -HX-Password:admin "http://0.0.0.0:8082/structr/rest/projects" -XPOST -d @/tmp/project.json
HTTP/1.1 100 Continue

HTTP/1.1 201 Created
Content-Type: application/json; charset=utf-8
Set-Cookie: JSESSIONID=rxjhtzyxetnj1l8dx6vg8aejq;Path=/
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Location: http://0.0.0.0:8082/structr/rest/projects/fafeada746ef432196ee2ccfc7e362fc
Vary: Accept-Encoding, User-Agent
Content-Length: 121
Server: Jetty(9.1.4.v20140401)

{
  "result_count": 1,
  "result": [
    "fafeada746ef432196ee2ccfc7e362fc"
  ],
  "serialization_time": "0.000226953"
}
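For readers who would rather script the upload, the same request can be sketched with Python's standard library. The URL, the credentials and the X-User/X-Password headers are taken from the curl call above; the helper name build_project_request is made up for this illustration, and the request is only built here, not sent.

```python
import json
import urllib.request

# URL and credentials copied from the curl call above; adjust for your instance.
PROJECTS_URL = "http://0.0.0.0:8082/structr/rest/projects"


def build_project_request(document, user="admin", password="admin"):
    """Build (but do not send) a POST request carrying the JSON document.

    build_project_request is a hypothetical helper, not part of Structr.
    """
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        PROJECTS_URL,
        data=body,
        # Same authentication headers that curl passes with -H.
        headers={"X-User": user, "X-Password": password},
        method="POST",
    )


request = build_project_request({"name": "Project1", "tasks": []})
print(request.get_method(), request.full_url)

# To actually send it (requires a running Structr instance):
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.headers.get("Location"))
```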

The complete graph was created, without creating any redundancy!

$ curl -i -HX-User:admin -HX-Password:admin "http://0.0.0.0:8082/structr/rest/projects/fafeada746ef432196ee2ccfc7e362fc/ui"
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Set-Cookie: JSESSIONID=el6ri37v61wzeuoni7ilgrl0;Path=/
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Vary: Accept-Encoding, User-Agent
Content-Length: 1222
Server: Jetty(9.1.4.v20140401)

{
   "query_time": "0.001580893",
   "result_count": 1,
   "result": {
      "id": "fafeada746ef432196ee2ccfc7e362fc",
      "name": "Project1",
      "owner": {
         "id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
         "name": "admin"
      },
      "type": "Project",
      "createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
      "deleted": false,
      "hidden": false,
      "createdDate": "2014-12-15T17:47:13+0100",
      "lastModifiedDate": "2014-12-15T17:47:13+0100",
      "visibleToPublicUsers": false,
      "visibleToAuthenticatedUsers": false,
      "visibilityStartDate": null,
      "visibilityEndDate": null,
      "tasks": [
         {
            "id": "517c1a89e44f479eb0802b9045271b4c",
            "name": "Task1"
         },
         {
            "id": "dace6757bad94aa0a137420741406699",
            "name": "Task2"
         },
         {
            "id": "097729f47768469ebeaacd00ea8a442e",
            "name": "Task3"
         },
         {
            "id": "27bfdc1bb293458eab0d912811f610da",
            "name": "Task4"
         },
         {
            "id": "b762fd8f2fe24d149d4a220412c56f49",
            "name": "Task5"
         }
      ]
   },
   "serialization_time": "0.000166250"
}

$ curl -i -HX-User:admin -HX-Password:admin "http://0.0.0.0:8082/structr/rest/companies/ui"
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Set-Cookie: JSESSIONID=1mc9hmhm2umrm1h60h1vt56o7c;Path=/
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Vary: Accept-Encoding, User-Agent
Content-Length: 2668
Server: Jetty(9.1.4.v20140401)

{
   "query_time": "0.002427711",
   "result_count": 3,
   "result": [
      {
         "id": "bc9de3a97bce409da5234e5976355aa9",
         "name": "Company1",
         "owner": {
            "id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
            "name": "admin"
         },
         "type": "Company",
         "createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
         "deleted": false,
         "hidden": false,
         "createdDate": "2014-12-15T17:47:12+0100",
         "lastModifiedDate": "2014-12-15T17:47:12+0100",
         "visibleToPublicUsers": false,
         "visibleToAuthenticatedUsers": false,
         "visibilityStartDate": null,
         "visibilityEndDate": null,
         "workers": [
            {
               "id": "2da31b7175cb44c787ff93fe43c7e317",
               "name": "Worker1"
            },
            {
               "id": "d980f56aacea4a0bb2cb8cf7b3a740d5",
               "name": "Worker2"
            }
         ]
      },
      {
         "id": "8fac823ef76f436c96cfc9e0c4c21fb5",
         "name": "Company2",
         "owner": {
            "id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
            "name": "admin"
         },
         "type": "Company",
         "createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
         "deleted": false,
         "hidden": false,
         "createdDate": "2014-12-15T17:47:12+0100",
         "lastModifiedDate": "2014-12-15T17:47:12+0100",
         "visibleToPublicUsers": false,
         "visibleToAuthenticatedUsers": false,
         "visibilityStartDate": null,
         "visibilityEndDate": null,
         "workers": [
            {
               "id": "a9a23f57b9ce4e33bcc1efbfd2537164",
               "name": "Worker3"
            }
         ]
      },
      {
         "id": "eb34f6449d69484f93c69320fe95ea24",
         "name": "Company3",
         "owner": {
            "id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
            "name": "admin"
         },
         "type": "Company",
         "createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
         "deleted": false,
         "hidden": false,
         "createdDate": "2014-12-15T17:47:13+0100",
         "lastModifiedDate": "2014-12-15T17:47:13+0100",
         "visibleToPublicUsers": false,
         "visibleToAuthenticatedUsers": false,
         "visibilityStartDate": null,
         "visibilityEndDate": null,
         "workers": [
            {
               "id": "eadcb538f90a41838f0196fd74b82037",
               "name": "Worker4"
            },
            {
               "id": "8f8bd3613aa543ecae222da22cdd2e14",
               "name": "Worker5"
            }
         ]
      }
   ],
   "serialization_time": "0.000221961"
}

$ curl -i -HX-User:admin -HX-Password:admin "http://0.0.0.0:8082/structr/rest/workers/ui"
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Set-Cookie: JSESSIONID=1b6g61a7uo1t016croq1fz2x41;Path=/
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Vary: Accept-Encoding, User-Agent
Content-Length: 6253
Server: Jetty(9.1.4.v20140401)

{
   "query_time": "0.002056031",
   "result_count": 5,
   "result": [
      {
         "id": "2da31b7175cb44c787ff93fe43c7e317",
         "name": "Worker1",
         "owner": {
            "id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
            "name": "admin"
         },
         "type": "Worker",
         "createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
         "deleted": false,
         "hidden": false,
         "createdDate": "2014-12-15T17:47:12+0100",
         "lastModifiedDate": "2014-12-15T17:47:12+0100",
         "visibleToPublicUsers": false,
         "visibleToAuthenticatedUsers": false,
         "visibilityStartDate": null,
         "visibilityEndDate": null,
         "company": {
            "id": "bc9de3a97bce409da5234e5976355aa9",
            "name": "Company1"
         },
         "tasks": [
            {
               "id": "9988348105e34b1ab5d365f4e4f7262a",
               "name": "Subtask1.1"
            },
            {
               "id": "517c1a89e44f479eb0802b9045271b4c",
               "name": "Task1"
            }
         ]
      },
      {
         "id": "d980f56aacea4a0bb2cb8cf7b3a740d5",
         "name": "Worker2",
         "owner": {
            "id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
            "name": "admin"
         },
         "type": "Worker",
         "createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
         "deleted": false,
         "hidden": false,
         "createdDate": "2014-12-15T17:47:12+0100",
         "lastModifiedDate": "2014-12-15T17:47:12+0100",
         "visibleToPublicUsers": false,
         "visibleToAuthenticatedUsers": false,
         "visibilityStartDate": null,
         "visibilityEndDate": null,
         "company": {
            "id": "bc9de3a97bce409da5234e5976355aa9",
            "name": "Company1"
         },
         "tasks": [
            {
               "id": "196189bde51a43afb587563fb47fda91",
               "name": "Subtask1.2"
            },
            {
               "id": "bd35947338924daa85b13391083551b1",
               "name": "Subtask1.3"
            },
            {
               "id": "dace6757bad94aa0a137420741406699",
               "name": "Task2"
            }
         ]
      },
      {
         "id": "a9a23f57b9ce4e33bcc1efbfd2537164",
         "name": "Worker3",
         "owner": {
            "id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
            "name": "admin"
         },
         "type": "Worker",
         "createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
         "deleted": false,
         "hidden": false,
         "createdDate": "2014-12-15T17:47:12+0100",
         "lastModifiedDate": "2014-12-15T17:47:12+0100",
         "visibleToPublicUsers": false,
         "visibleToAuthenticatedUsers": false,
         "visibilityStartDate": null,
         "visibilityEndDate": null,
         "company": {
            "id": "8fac823ef76f436c96cfc9e0c4c21fb5",
            "name": "Company2"
         },
         "tasks": [
            {
               "id": "779b76705c9b43d598ce971024743b13",
               "name": "Subtask1.4"
            },
            {
               "id": "097729f47768469ebeaacd00ea8a442e",
               "name": "Task3"
            }
         ]
      },
      {
         "id": "eadcb538f90a41838f0196fd74b82037",
         "name": "Worker4",
         "owner": {
            "id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
            "name": "admin"
         },
         "type": "Worker",
         "createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
         "deleted": false,
         "hidden": false,
         "createdDate": "2014-12-15T17:47:13+0100",
         "lastModifiedDate": "2014-12-15T17:47:13+0100",
         "visibleToPublicUsers": false,
         "visibleToAuthenticatedUsers": false,
         "visibilityStartDate": null,
         "visibilityEndDate": null,
         "company": {
            "id": "eb34f6449d69484f93c69320fe95ea24",
            "name": "Company3"
         },
         "tasks": [
            {
               "id": "8b2dbfcc93b94e428052ff1276991e34",
               "name": "Subtask4.1"
            },
            {
               "id": "f4b99c128cc24c518e6f9af5c3affea4",
               "name": "Subtask4.2"
            },
            {
               "id": "e6156b8b566e45349e671229ff70f9ea",
               "name": "Subtask4.3"
            },
            {
               "id": "27bfdc1bb293458eab0d912811f610da",
               "name": "Task4"
            },
            {
               "id": "5e45f0d22ee944ec84bc31aa75b40dda",
               "name": "Subtask5.1.1"
            },
            {
               "id": "2e21cad09d09423dacec07abcc763c3f",
               "name": "Subtask5.1.2"
            },
            {
               "id": "ad2e56c80ea944808b7082cdcab9f659",
               "name": "Subtask5.1"
            },
            {
               "id": "6173bc32066b40498147630d92c17990",
               "name": "Subtask5.2.1"
            },
            {
               "id": "83889c6a55f043c6bd69dc04614ff76f",
               "name": "Subtask5.2.2"
            },
            {
               "id": "e04e9f77fa444c40904b474f27bcdc61",
               "name": "Subtask5.2"
            }
         ]
      },
      {
         "id": "8f8bd3613aa543ecae222da22cdd2e14",
         "name": "Worker5",
         "owner": {
            "id": "f02e59a47dc9492da3e6cb7fb6b3ac25",
            "name": "admin"
         },
         "type": "Worker",
         "createdBy": "f02e59a47dc9492da3e6cb7fb6b3ac25",
         "deleted": false,
         "hidden": false,
         "createdDate": "2014-12-15T17:47:13+0100",
         "lastModifiedDate": "2014-12-15T17:47:13+0100",
         "visibleToPublicUsers": false,
         "visibleToAuthenticatedUsers": false,
         "visibilityStartDate": null,
         "visibilityEndDate": null,
         "company": {
            "id": "eb34f6449d69484f93c69320fe95ea24",
            "name": "Company3"
         },
         "tasks": [
            {
               "id": "7785c5445d2147578bc8831797241e53",
               "name": "Subtask4.4"
            },
            {
               "id": "b762fd8f2fe24d149d4a220412c56f49",
               "name": "Task5"
            }
         ]
      }
   ],
   "serialization_time": "0.000484448"
}

Isn't that fascinating? :-)
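The result counts above can be cross-checked against the input. The following sketch rebuilds the example document in compact form and compares how often each type occurs with how many distinct names it has. The attribute-to-type mapping and the helper names (task, collect) are assumptions made for this illustration, not Structr API.

```python
from collections import defaultdict

# Assumed mapping from attribute name to target type, following the schema above.
ATTRIBUTE_TYPES = {
    "tasks": "Task",
    "subtasks": "Task",
    "worker": "Worker",
    "company": "Company",
}


def task(name, worker, company, subtasks=()):
    """Build one task object in the same shape as the example document."""
    obj = {"name": name, "worker": {"name": worker, "company": {"name": company}}}
    if subtasks:
        obj["subtasks"] = list(subtasks)
    return obj


def collect(obj, type_name, occurrences, names):
    """Count each object occurrence and record the distinct names per type."""
    occurrences[type_name] += 1
    names[type_name].add(obj["name"])
    for key, value in obj.items():
        target = ATTRIBUTE_TYPES.get(key)
        if target is None:
            continue  # scalar attribute such as "name"
        children = value if isinstance(value, list) else [value]
        for child in children:
            collect(child, target, occurrences, names)


document = {
    "name": "Project1",
    "tasks": [
        task("Task1", "Worker1", "Company1", [
            task("Subtask1.1", "Worker1", "Company1"),
            task("Subtask1.2", "Worker2", "Company1"),
            task("Subtask1.3", "Worker2", "Company1"),
            task("Subtask1.4", "Worker3", "Company2"),
        ]),
        task("Task2", "Worker2", "Company1"),
        task("Task3", "Worker3", "Company2"),
        task("Task4", "Worker4", "Company3", [
            task("Subtask4.1", "Worker4", "Company3"),
            task("Subtask4.2", "Worker4", "Company3"),
            task("Subtask4.3", "Worker4", "Company3"),
            task("Subtask4.4", "Worker5", "Company3"),
        ]),
        task("Task5", "Worker5", "Company3", [
            task("Subtask5.1", "Worker4", "Company3", [
                task("Subtask5.1.1", "Worker4", "Company3"),
                task("Subtask5.1.2", "Worker4", "Company3"),
            ]),
            task("Subtask5.2", "Worker4", "Company3", [
                task("Subtask5.2.1", "Worker4", "Company3"),
                task("Subtask5.2.2", "Worker4", "Company3"),
            ]),
        ]),
    ],
}

occurrences = defaultdict(int)
names = defaultdict(set)
collect(document, "Project", occurrences, names)

for type_name in ("Project", "Task", "Worker", "Company"):
    print(type_name, occurrences[type_name], "occurrences,", len(names[type_name]), "distinct")
```

The 19 Worker and 19 Company occurrences in the input collapse to 5 and 3 distinct names, which matches the result_count values returned by the /workers/ui and /companies/ui calls above.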

How it works

Workflow

The parser (GSON) creates a nested structure of maps (JsonInput) from the JSON, which are recursively matched against the schema rules, starting from the innermost object. Structr uses a so-called "DeserializationStrategy" to find out whether a nested object already exists in the graph (and can therefore be linked directly), or whether it should be created according to the auto-creation rules in the schema.
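The "innermost first" order can be illustrated with a few lines of Python. This is only a sketch of the idea, not Structr's Java implementation: json.loads stands in for GSON, plain dicts stand in for JsonInput, and ATTRIBUTE_TYPES and resolution_order are names invented for this example.

```python
import json

# Assumed attribute-to-type mapping, following the example schema in this post.
ATTRIBUTE_TYPES = {"tasks": "Task", "subtasks": "Task", "worker": "Worker", "company": "Company"}

raw = """
{
   "name": "Project1",
   "tasks": [
       { "name": "Task1", "worker": { "name": "Worker1", "company": { "name": "Company1" } } }
   ]
}
"""


def resolution_order(obj, type_name):
    """Return (type, name) pairs in the order they would be resolved:
    nested objects first, the enclosing object last."""
    order = []
    for key, value in obj.items():
        target = ATTRIBUTE_TYPES.get(key)
        if target is None:
            continue  # plain value, nothing to descend into
        children = value if isinstance(value, list) else [value]
        for child in children:
            order.extend(resolution_order(child, target))
    order.append((type_name, obj["name"]))
    return order


nested_maps = json.loads(raw)  # nested dicts and lists
for step in resolution_order(nested_maps, "Project"):
    print(step)
```

The Company comes out first and the Project last, mirroring the recursion described above.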

Recursive evaluation

When parsing the above JSON document, Structr looks for a Project with the name `Project1` and an array of Tasks with the names `Task1` to `Task5`. To look up the first Task with name `Task1`, Structr recursively calls the DeserializationStrategy to obtain the desired Task, and does this again for the Worker and the Worker's Company. Since the Company entity has no more nested elements, the recursion stops and Structr looks for a Company with the name `Company1` in the database. The company does not exist, so the auto-creation settings cause it to be created and returned to the previous recursion level, where it is linked to the newly created Worker with the name `Worker1`.

This process recursively creates new entities, or fetches them from the database if they already exist, mapping the nested JSON document to a graph structure according to the schema rules.
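A minimal find-or-create sketch of this recursion follows, using an in-memory dictionary in place of Neo4j. Everything in it (InMemoryGraph, deserialize, ATTRIBUTE_TYPES) is hypothetical and only mirrors the behaviour described above: look up an object by its unique name, create it if it is missing, then link it to the enclosing object.

```python
# Assumed attribute-to-type mapping, following the example schema in this post.
ATTRIBUTE_TYPES = {"tasks": "Task", "subtasks": "Task", "worker": "Worker", "company": "Company"}


class InMemoryGraph:
    """Stand-in for the database: nodes keyed by (type, unique name)."""

    def __init__(self):
        self.nodes = {}
        self.created = []   # creation order, for inspection
        self.edges = set()  # (source key, attribute, target key)

    def find_or_create(self, type_name, name):
        key = (type_name, name)
        if key not in self.nodes:
            # Not found: behave like an ALWAYS auto-creation rule.
            self.nodes[key] = {"type": type_name, "name": name}
            self.created.append(key)
        return key


def deserialize(graph, type_name, data):
    """Resolve nested objects first, then this object, then link them."""
    linked = []
    for attribute, value in data.items():
        target_type = ATTRIBUTE_TYPES.get(attribute)
        if target_type is None:
            continue  # scalar attribute such as "name"
        children = value if isinstance(value, list) else [value]
        for child in children:
            linked.append((attribute, deserialize(graph, target_type, child)))
    key = graph.find_or_create(type_name, data["name"])
    for attribute, target_key in linked:
        graph.edges.add((key, attribute, target_key))
    return key


document = {
    "name": "Project1",
    "tasks": [
        {"name": "Task1", "worker": {"name": "Worker1", "company": {"name": "Company1"}}},
        {"name": "Task2", "worker": {"name": "Worker2", "company": {"name": "Company1"}}},
    ],
}

graph = InMemoryGraph()
deserialize(graph, "Project", document)
for key in graph.created:
    print(key)
```

Company1 appears twice in this input but is created only once; the second occurrence is found and linked to Worker2.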

Structr as a Document Database

The described feature will greatly enhance the document database capabilities of Structr, and will be part of the upcoming 1.1 release.

You can find the test code for this particular example behind the following link:

https://github.com/structr/structr/blob/master/structr-rest/src/test/java/org/structr/rest/document/DocumentTest.java
