List files and folders of a remote zip archive without decompression

Hey all,
I want to list files and folders of a remote zip archive without decompressing it.

I did this for local zip archives with the Python Source node and the following code:

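import zipfile
import pandas as pd

# Collect the folder entries whose names end in ".d/".
# namelist() only reads the archive's table of contents; nothing is extracted.
measurements = []
with zipfile.ZipFile(flow_variables['location'], 'r') as f:
    for entry in f.namelist():
        if entry.endswith(".d/"):
            measurements.append(entry)

output_table = pd.DataFrame(measurements, columns=["measurements"])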

Do you have any idea how to do this for a remote archive? I don’t want to, and in fact can’t, decompress the archives just to get the list of files and folders. One file is around 600 GB.

Thanks in advance,
Johann
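
For readers with the same question: one possible approach, which nobody in this thread has confirmed, relies on the fact that a zip file keeps its table of contents (the central directory) at the end of the file. If the archive is reachable over HTTP(S) and the server honors Range requests and reports Content-Length (all assumptions), Python's zipfile can read just those bytes. The names `RangeReader`, `http_range_reader`, and `list_measurements` are made up for this sketch, and the URL in the comment is hypothetical.

```python
import io
import urllib.request
import zipfile


class RangeReader:
    """Read-only, seekable file-like object that fetches bytes on demand.

    `fetch(start, end)` is a hypothetical callable returning the bytes
    start..end (inclusive); `size` is the total archive size in bytes.
    """

    def __init__(self, fetch, size):
        self._fetch = fetch
        self._size = size
        self._pos = 0

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_CUR:
            offset += self._pos
        elif whence == io.SEEK_END:
            offset += self._size
        if offset < 0:
            # Mirror real files, which refuse negative positions.
            raise OSError("negative seek position")
        self._pos = offset
        return self._pos

    def read(self, n=-1):
        remaining = self._size - self._pos
        if n is None or n < 0 or n > remaining:
            n = remaining
        if n <= 0:
            return b""
        data = self._fetch(self._pos, self._pos + n - 1)
        self._pos += len(data)
        return data


def http_range_reader(url):
    """Build a RangeReader for an HTTP(S) URL (assumes Range support)."""
    head = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(head) as resp:
        size = int(resp.headers["Content-Length"])

    def fetch(start, end):
        req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        with urllib.request.urlopen(req) as resp:
            if resp.status != 206:
                # Server ignored the Range header; abort instead of downloading everything.
                raise OSError("server did not return a partial response")
            return resp.read()

    return RangeReader(fetch, size)


def list_measurements(fileobj, suffix=".d/"):
    """Return archive entries ending in `suffix`, read from the table of contents only."""
    with zipfile.ZipFile(fileobj, "r") as zf:
        return [name for name in zf.namelist() if name.endswith(suffix)]


# Hypothetical usage:
# names = list_measurements(http_range_reader("https://example.org/archive.zip"))
```

Only the end-of-archive record and the central directory are requested, so the amount transferred depends on the number of entries and not on the archive size. For other remote file systems (SFTP, S3, and so on) the same `RangeReader` could in principle be fed a different `fetch` function, but that is untested here.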

Hello @wurz,

I tried the List Files/Folders node with a local zip, and the files were listed as expected. The node also has a File System Connection port, so hopefully it works on a remote zip as well. Have you tried it?

Br,
Ivan

Hello @ipazin,
thanks for your answer. I’m aware of the List Files/Folders node and am already using it to list all the files in a given folder at a remote location. I had also looked for an option to list the files inside a zip archive, but I didn’t find one. After your suggestion, I tried it with a local zip file, but that didn’t work either. Maybe I have overlooked something?

Best,
Johann

Hello @wurz,

You are right. I made a silly mistake, which is why I got files listed. For a workaround, check out this topic:

Br,
Ivan

Thanks for your effort anyway. I had hoped I wouldn’t need a workaround for this…

It would be a great future feature for the List Files/Folders node to be able to read zip files as well.

You are welcome.
There is a ticket and I have noted your request there. (Internal Reference: AP-17578)
Ivan

Hi!
+1 from me!
Best regards.

Done @jorgemartcaam
Ivan

+1 from me too, please, @ipazin

Hello everyone,
as Ivan already mentioned, we are looking into implementing a File Archive Connector node. It’s still in the planning phase, so any feedback from you is highly appreciated.
Here is what is planned so far:
The node would have an optional file system connection, and the user would select a file archive (zip, jar, tar, …) in the node dialog. The output of the node would be a file system that represents this archive, so you could use the file utility nodes to work with the archive. For example, you could use the List Files/Folders node to list the entries of the archive. If the archive format supports it, you would also be able to read and write single entries in the archive.
What do you think?
Bye
Tobias

Hello @tobias.koetter,
that sounds exactly like what I had imagined! Listing files and subdirectories from the archive would be enough for my particular use case, but your plan is definitely more future-proof than that!

I will think about this feature some more. Maybe I’ll have something to add.

Thanks for your responses and answers,
Best,
Johann
