Revit file format definition

Hi there,
While digging into Revit, I recently bought the book “Dynamo and Grasshopper for Revit” by Marcello Sgambelluri.
A couple of the Dynamo graphs in it retrieve data from a Revit project by parsing the file as ASCII characters (the book’s examples get the rvt version, build and worksharing status).
Is there an online resource with the full rvt file format definition, where all the keywords are listed?

Maybe this is a little OT here, and perhaps less practical than accessing such data via the Revit API…
Within Revit, I know there’s even a great plugin by Jeremy Tammik for inspecting the Revit project database, called RevitLookup.
But… I’m still curious about the possibility of parsing the rvt file and its details.

Thanks in advance for your time.

Right-click any Revit file and choose “Open with”.
Choose Notepad and see what you can find.
I think only the file version and worksharing status are readable data; the rest is binary.
Just never, ever save a Revit file opened in Notepad.
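If Notepad feels too risky, the same "look for readable text" idea can be sketched in Python, which only opens the file for reading. This is a rough sketch, not a parser for the rvt format: the function names are made up for illustration, and it simply collects runs of printable characters stored either as plain ASCII or as UTF-16LE (a printable byte followed by a zero byte).

```python
import re


def readable_strings(data, min_len=6):
    """Return printable text runs found in raw bytes (ASCII and UTF-16LE)."""
    # Plain ASCII: min_len or more printable bytes in a row.
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # UTF-16LE: a printable byte followed by a zero byte, repeated.
    utf16_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_pat.finditer(data)]
    found += [m.group().decode("utf-16le") for m in utf16_pat.finditer(data)]
    return found


def readable_strings_in_file(path, min_len=6):
    """Read a file in binary, read-only mode and return its text runs."""
    with open(path, "rb") as handle:  # "rb" never writes to the file
        return readable_strings(handle.read(), min_len)
```

Calling `readable_strings_in_file(r"C:\path\to\project.rvt")` (a placeholder path) returns a list of strings to skim through. Expect a lot of noise, since random binary data also contains short printable runs; raising `min_len` filters some of it out.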

Hi @Marcel_Rijsmus, and thank you for your comment.
I thought about Notepad or even a more powerful hex editor, but I was hoping there were other resources.
You’re right though: not the right approach if it is all binary, not to mention saving it (a bit like saving an MS-DOS formatted CSV file after editing it in Excel… yikes!).
Bye

Hello,
it is not possible to access the Revit API without a Revit or Forge context.
A Revit project is stored as a compound storage file (OLE Structured Storage).

You can read some information from the “BasicFileInfo” stream with any library that can read OLE storage.

Some information here

An example with Python 3 and the olefile module

(use the translate button on the blog)

Note that this data is not necessarily consistent between builds of Revit.
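One way to cope with that is to try several known layouts and report which one matched, rather than assuming a single layout. Below is a minimal sketch under stated assumptions: the two regexes are the ones from the olefile script posted further down in this thread, while the labels, the function name and the sample strings are made up for illustration and are not real stream contents.

```python
import re

# Regexes copied from the script in this thread; other builds may need more.
# The labels are hypothetical names used only to report which pattern hit.
VERSION_PATTERNS = [
    ("format-field", re.compile(r"Format.+?(\d{4})")),
    ("build-field", re.compile(r"(\d{4}).+Build")),
]


def find_version(file_info):
    """Return (version, label) for the first matching pattern, else (None, None)."""
    for label, pattern in VERSION_PATTERNS:
        match = pattern.search(file_info)
        if match:
            return match.group(1), label
    # Unknown layout: let the caller decide what to do instead of crashing.
    return None, None


# Made-up sample strings shaped to fit the regexes above.
print(find_version("Format: 2019"))
print(find_version("Revit 2016 (Build: 1234)"))
print(find_version("some unknown layout"))
```

Returning `(None, None)` for an unrecognised layout makes it easy to log the raw text and add a new pattern when a build writes the stream differently.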

Hi Cyril, thanks for the reply.
French is fine by me.
Good to hear there’s a way in Python to access such data without opening an rvt project!
Hopefully a future Revit release will include the latest Dynamo core, with the ability to use Python 3 and externally loaded modules.
Bye

Hi @jacob.small thank you for pointing that out.
Bye

You can use Dynamo Sandbox with the CPython3 engine:

import sys
# Example paths: adjust them to your own embedded Python install.
sys.path.append(r'C:\Users\UserName\AppData\Local\python-3.8.3-embed-amd64\Lib')
sys.path.append(r'C:\Users\UserName\AppData\Local\python-3.8.3-embed-amd64\Lib\site-packages')
sys.path.append(r'C:\Users\UserName\AppData\Local\python-3.8.3-embed-amd64\Lib\site-packages\olefile-0.47.dev4-py3.8.egg')
import os.path as op
import re

import olefile

pathrvt_file = IN[0]

def get_rvt_file_version(rvt_file):
    """Return (version, raw_info) from the BasicFileInfo stream, or (None, None)."""
    if not op.exists(rvt_file) or not olefile.isOleFile(rvt_file):
        return None, None
    rvt_ole = olefile.OleFileIO(rvt_file)
    try:
        # Read the stream once, then close the file so it is not left locked.
        raw = rvt_ole.openstream("BasicFileInfo").read()
    finally:
        rvt_ole.close()
    # The byte order is not the same in every file, so decode both ways.
    file_infoLe = raw.decode("utf-16le", "ignore")
    file_infoBe = raw.decode("utf-16be", "ignore")
    # "਀" (U+0A00) is a big-endian line feed (bytes 00 0A) misread as little-endian.
    file_info = file_infoBe if "਀" in file_infoLe else file_infoLe
    if "Format" in file_info:
        match = re.search(r"Format.+?(\d{4})", file_info)
    else:
        match = re.search(r"(\d{4}).+Build", file_info)
    # Neither pattern may match on some builds, so do not assume a match.
    rvt_file_version = match.group(1) if match else None
    return rvt_file_version, file_info

rvtVersion, fileInFo = get_rvt_file_version(pathrvt_file)

OUT = rvtVersion, sys.version, fileInFo

That’s exactly what I was talking about!
Hoping to get that Python engine directly inside Revit too!

Revit files are OLE Structured Storage. Here are some Jupyter lines playing around with the file. One can extract the basic file info, but the other streams seem difficult to decode. Maybe someone reading this has information that may help?
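For anyone who wants to poke at the other streams, a first step could be listing what the container holds and how large each stream is. This is only a sketch: `summarize_streams` and `list_rvt_streams` are names made up here, and the code assumes the third-party olefile package (`pip install olefile`), using its `listdir`, `get_size` and `close` methods. It does not attempt to decode any stream.

```python
def summarize_streams(ole):
    """Return sorted (stream_path, size_in_bytes) pairs for an open container.

    `ole` is expected to behave like olefile.OleFileIO (listdir, get_size).
    """
    rows = []
    # listdir returns each stream as a list of path parts, e.g. ["A", "B"].
    for parts in ole.listdir(streams=True, storages=False):
        name = "/".join(parts)
        rows.append((name, ole.get_size(name)))
    return sorted(rows)


def list_rvt_streams(path):
    """Open a file read-only with olefile and summarize its streams."""
    import olefile  # third-party; imported here so the helper above has no dependency

    ole = olefile.OleFileIO(path)
    try:
        return summarize_streams(ole)
    finally:
        ole.close()
```

Something like `for name, size in list_rvt_streams(r"C:\path\to\project.rvt"): print(name, size)` (placeholder path) then gives an overview of the streams to investigate.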
