Reading a newline-delimited GeoJSON file in Dynamo

Hi,

I have a newline-delimited GeoJSON data file. It was exported from QGIS, from an OSM file that carries a height attribute. I want to parse it in Dynamo, build polylines from the geometry["coordinates"] values to get the building footprints, then turn each footprint into a polycurve and extrude it according to the value in the "Heights_max" property.

The file looks like this (data string):

{ "type": "Feature", "properties": { "full_id": "w1097381425", "osm_id": "1097381425", "osm_uid": "8150490", "building": null, "Heights_mean": -3.8323120390415735, "Heights_min": -6.0, "Heights_max": 12 }, "geometry": { "type": "Polygon", "coordinates": [ [ [ 29.1476383, 40.8478408 ], [ 29.1476134, 40.8477578 ], [ 29.1476489, 40.8477517 ], [ 29.1476738, 40.8478347 ], [ 29.1476383, 40.8478408 ] ] ] } }

I searched for how to parse a GeoJSON file and tried to parse it like this:

import json
data = json.loads(datastring)
data['geometry']["coordinates"]

Then I wrote this in a Python Script node that takes a string input, into which I copied and pasted the contents of the GeoJSON file:

# Load the Python Standard and DesignScript Libraries
import sys
import clr
clr.AddReference('ProtoGeometry')
from Autodesk.DesignScript.Geometry import *

import json

# The inputs to this node will be stored as a list in the IN variables.
Text=IN[0]

# Place your code below this line
data = json.loads(Text)

# Assign your output to the OUT variable.
OUT =data["geometry"]["coordinates"]

But I couldn't parse it. Could you help me parse the file?
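
For context, here is a minimal sketch of why a single json.loads call is likely to fail on this kind of input, and how the pasted text could be parsed line by line instead. It assumes the string keeps one feature per line; the helper name parse_ndjson_text and the two stand-in features are hypothetical, not taken from the real file.

```python
import json

def parse_ndjson_text(text):
    """Parse newline-delimited JSON held in a single string, one object per line."""
    features = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. a trailing newline
        features.append(json.loads(line))
    return features

# Two tiny stand-in features (hypothetical data, shaped like the sample above)
sample_text = (
    '{"type": "Feature", "properties": {"Heights_max": 12}, "geometry": null}\n'
    '{"type": "Feature", "properties": {"Heights_max": 7}, "geometry": null}\n'
)

whole_text_error = None
try:
    json.loads(sample_text)  # whole text at once: this is not a single JSON document
except ValueError as err:
    whole_text_error = str(err)  # message starts with "Extra data"

features = parse_ndjson_text(sample_text)
```

Inside a Dynamo Python Script node, the same helper could presumably be used as OUT = parse_ndjson_text(IN[0]), provided the string input preserves the line breaks.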

PS: An answer on Stack Overflow also suggests that the geopandas package may be a better fit, but I guess installing that package in Dynamo would be a much harder way to parse the file.

I tried using the dataJSON package in Dynamo.
I tried reading the file as a CSV, but I got too many columns that way.
Should I import the GeoJSON file into Excel and convert it to CSV?

parsejson2.dyn (50.2 KB)

To help directly we would need the JSON file as well; otherwise we have to guess at what the data should be.

please_change_dyn_to_json.dyn (67.1 KB)

Hi,
I changed the extension from json to dyn so that I could attach it here. Thank you.

The file is not a single valid JSON document; it is a series of JSON strings, one per line.
You can iterate over the file like this:

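import json

# IN[0] is the path to the newline-delimited GeoJSON file
json_file = IN[0]

result = []
with open(json_file, "r") as jf:
    for line in jf:
        # Each non-empty line is one complete JSON object (one GeoJSON feature)
        if line.strip():
            result.append(json.loads(line))

OUT = result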

thank you very much!

There’s also this answer on Stack Overflow that is more fault tolerant; however, it requires a ‘readable’ object such as a file object.
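
As a point of comparison, and not the Stack Overflow answer itself, here is a sketch of a different technique that does not depend on one object per line: the standard library's json.JSONDecoder.raw_decode parses one value and reports where it stopped, so values can be read back to back from a string. The function name iter_json_objects and the sample string are made up for illustration, and the whole file would have to be read into memory first (for example with f.read()).

```python
import json

def iter_json_objects(text):
    """Yield each JSON value found back to back in `text`, whatever whitespace separates them."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not accept leading whitespace, so skip it manually
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Returns the decoded value and the index just past it
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Hypothetical input: objects separated by a space, blank lines and a newline
concatenated = '{"a": 1} {"b": 2}\n\n{"c": [3, 4]}\n'
objects = list(iter_json_objects(concatenated))
```

A malformed object still raises an error here; the only tolerance gained is for how the objects are separated.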

Thank you again @Mike.Buttery @jacob.small

Today, when I tried to open the dyn file, Dynamo reported an unknown error reading the file. The cause was that, after my work, the file contained "NodeLibraryDependencies": null. I changed it to "NodeLibraryDependencies": [] and then it opened.

I am copying the script here in case I need it again in the future.
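
If the same problem comes back, the manual edit could be scripted. This is only a sketch, assuming the .dyn file is plain JSON text (as the key above suggests); the function name fix_node_library_dependencies and the trimmed sample content are hypothetical, and it would be wise to keep a backup of the graph before overwriting it.

```python
import json

def fix_node_library_dependencies(dyn_text):
    """Return .dyn JSON text with a null (or missing) NodeLibraryDependencies set to []."""
    graph = json.loads(dyn_text)
    if graph.get("NodeLibraryDependencies") is None:
        graph["NodeLibraryDependencies"] = []
    # Re-serialising changes whitespace only; keys and values are otherwise kept
    return json.dumps(graph, indent=2)

# Hypothetical, heavily trimmed stand-in for a .dyn file's contents
broken = '{"Name": "parsejson2", "NodeLibraryDependencies": null}'
fixed = fix_node_library_dependencies(broken)
```

The result would then be written back to a copy of the .dyn file with an ordinary open(path, "w") call.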

# Load the Python Standard and DesignScript Libraries
import sys
import clr
clr.AddReference('ProtoGeometry')
from Autodesk.DesignScript.Geometry import *

import json

# The inputs to this node will be stored as a list in the IN variables.

json_file = IN[0]

# Place your code below this line

result = []
with open(json_file, "r") as jf:
    for line in jf:
        result.append(json.loads(line))

# Assign your output to the OUT variable.

OUT = result
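
As a possible next step towards the footprints, here is a sketch that pulls the outer ring and the "Heights_max" value out of one parsed feature. The helper name footprint_and_height is hypothetical, only Polygon geometries are handled, and the points stay in longitude/latitude degrees, so they would presumably need projecting to metres before being extruded by a height in Dynamo.

```python
def footprint_and_height(feature, height_key="Heights_max"):
    """Return (outer_ring_points, height) for a GeoJSON Polygon feature, or None."""
    geometry = feature.get("geometry") or {}
    if geometry.get("type") != "Polygon":
        return None  # MultiPolygon, Point, etc. are not handled in this sketch
    outer_ring = geometry["coordinates"][0]  # first ring is the exterior boundary
    # GeoJSON repeats the first point at the end to close the ring; drop the repeat
    if len(outer_ring) > 1 and outer_ring[0] == outer_ring[-1]:
        outer_ring = outer_ring[:-1]
    properties = feature.get("properties") or {}
    points = [(pt[0], pt[1]) for pt in outer_ring]  # keep x (lon) and y (lat) only
    return points, properties.get(height_key)

# The sample feature from the first post, trimmed to the fields used here
sample_feature = {
    "type": "Feature",
    "properties": {"full_id": "w1097381425", "Heights_max": 12},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [29.1476383, 40.8478408], [29.1476134, 40.8477578],
            [29.1476489, 40.8477517], [29.1476738, 40.8478347],
            [29.1476383, 40.8478408],
        ]],
    },
}

points, height = footprint_and_height(sample_feature)
```

Applied to the list produced by the script above, this gives one list of points and one height per building, which is the shape of data that point and polycurve nodes would need.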

I am also writing down Mr Buttery’s other suggestion as a note, in case it is needed in the future. However, I couldn’t run this script; I still need to work on it.

# Load the Python Standard and DesignScript Libraries
import sys
import clr
clr.AddReference('ProtoGeometry')
from Autodesk.DesignScript.Geometry import *

# The inputs to this node will be stored as a list in the IN variables.

fn = IN[0]

# Place your code below this line

def stream_read_json(fn):
    import json
    import re
    start_pos = 0
    with open(fn, 'r') as f:
        while True:
            try:
                obj = json.load(f)
                yield obj
                return
            except ValueError as e:
                f.seek(start_pos)
                # Raw string so the regex escapes (\d, \() are not treated as string escapes
                end_pos = int(re.match(r'Extra data: line \d+ column \d+ .*\(char (\d+).*\)',
                                    e.args[0]).groups()[0])
                json_str = f.read(end_pos)
                obj = json.loads(json_str)
                start_pos += end_pos
                yield obj

# Assign your output to the OUT variable.

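# Collect the generator's results into a list so the node outputs the parsed objects
OUT = list(stream_read_json(fn))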