Reading Excel File in Sequence

I am working on a script to read a large number of Excel file data. The data will be processed and write to a single Excel file. It will be a repetitive exercise. Is there a node can automatically open the next Excel in sequence in a folder without manually changing the file destination each time? I am thinking the Excel might need to be named in a certain way.

excelfile1001
excelfile1002
excelfile1003
excelfile1004 . . .

The attached dyn takes a chosen directory and looks for all of the .xlsx files within it and returns their file paths. At this stage, you could sort by filename if desired. If you have variable worksheet names (unlike just using Sheet1 or a similar default) you can query the worksheet names using the Bumblebee package.

The Data.ImportExcel then brings in all of the data from the Excel files. Set showExcel to false to stop it visually opening every file. Depending on how many files, and how large they are, it might take some time. Once it is complete you will have a 3D list where @L3 are the workbooks, @L2 are the rows, and @L1 are the column values. You can then do whatever data manipulation you want. In this simple example, it merely flattens the top level so that the resulting 2D list can be written easily to one worksheet.

The final step is then to write it out to a single workbook.

Hope this helps,
Thomas

Input and Output Excel

[39440.dyn|attachment]
39440.dyn (33.5 KB)

4 Likes

As much as I like Dynamo (for geometry, modelling etc) and this is a Dynamo forum, Dynamo probably isn’t the best tool for this particular task

I’d suggest Pentaho Data Integration (community edition is free)
Same thing can be achieved in 2 nodes


4 Likes

Agreed, it’s a good point. I just wasn’t sure whether the original poster intended the “Do stuff with data” section to incorporate data from Revit models, or other dynamo processes, as opposed to just processing the Excel data

1 Like

I learned something news. Thank you for the new insight.

I did a testing of the nodes. For unknown reasons, the FileSystem.DetDirectoryContents couldn’t find the Excel file from ‘.xlsx’ and '.*’. I am using Revit 2018, Dynamo 2.0.3

2019-08-16_11-27-09

Are the screengrabs shown after running, or without it having been run? The null output is unexpected because the node should return a dictionary consisting of files and directories. Even if there is nothing to be found it should return empty lists within the dictionary so it looks like something is wrong if the graph has been run.

I use 2019 and 2.0.3 but I just tested with 2018 and it was fine

1 Like

It didn’t have any error messages ‘yellow node’. You reminded me I can try 2019 too.

It works on R19! It does what it supposed to do in the output Excel file. Now I need to search interesting data to show in Power BI graphic. Maybe I will use stock data first. Thanks again.

I used the 6-month historical stock price .csv for the exercise. The node read the 5th column on each of the five excel in the directory and write on the new column on the output excel.