RegEx extract


This is a companion discussion topic for the original entry at https://kni.me/w/Yieb55Qd90lrR4IQ

How would I extract a file name from a path when the file name itself can contain periods? Basically, I just want the name that follows the last slash (a backslash in my example) and excludes the file extension.

C:\Dan\Energy\raw data\Champion_REFLECTIVE ENERGY SOLUTIONS LLC MD 05.20.2021.xls

So I would want Champion_REFLECTIVE ENERGY SOLUTIONS LLC MD 05.20.2021

Thank you
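For readers who do want a regular expression for this, here is a minimal sketch in Python. The pattern and the helper name `extract_stem` are illustrative assumptions and not something posted in this thread. It takes the last path segment (after either kind of slash) and leaves off only the final extension, so earlier periods in the name survive.

```python
import re

# Illustrative pattern (an assumption, not from the thread):
#   ([^\\/]+?)        group 1: last path segment, matched lazily
#   (?:\.[^.\\/]*)?   optional final ".ext", kept out of group 1
#   $                 anchor at the end of the string
FILE_STEM = re.compile(r"([^\\/]+?)(?:\.[^.\\/]*)?$")


def extract_stem(path: str) -> str:
    """Return the file name without its last extension (hypothetical helper)."""
    match = FILE_STEM.search(path)
    if match is None:
        # Happens for an empty string or a path that ends in a slash.
        raise ValueError(f"no file name found in {path!r}")
    return match.group(1)


example = r"C:\Dan\Energy\raw data\Champion_REFLECTIVE ENERGY SOLUTIONS LLC MD 05.20.2021.xls"
print(extract_stem(example))
```

The pattern only uses character classes, a lazy quantifier and a non-capturing group, so it should carry over to other regex flavours, though that is untested here.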

Hi @Shmelky , you can use the URL to File Path node:

Here’s a quick example I put together:
Input: (screenshot)

Output: (screenshot)

You can see the File name in the File name column.

Here’s the workflow:
extract file name.knwf (6.2 KB)


Hi @Shmelky -

For your use case, you might try a different approach from RegEx altogether. The URL to File Path node will take your input string and split it up into the constituent parts - folder, file name, extension, and so on. I think you can just use the file name that is returned, like in the screenshot below:

EDIT: Beaten to the punch by @bruno29a!
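The same kind of split into folder, file name, and extension can be sketched outside KNIME with Python's `pathlib`. This is only an analogy to what the node returns, not a description of how the node is implemented; the variable names are made up for the example.

```python
from pathlib import PureWindowsPath

# PureWindowsPath parses Windows-style paths on any operating system.
path = PureWindowsPath(
    r"C:\Dan\Energy\raw data\Champion_REFLECTIVE ENERGY SOLUTIONS LLC MD 05.20.2021.xls"
)

folder = str(path.parent)   # everything before the last backslash
file_name = path.stem       # name without the final extension
extension = path.suffix     # final extension, including the leading dot

print(folder)
print(file_name)
print(extension)
```

Only the last suffix is treated as the extension, so the periods in the date stay in the file name.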


lol @ScottF . I think yours literally came in 1 sec after mine. But we proposed the same solution and created the same example. :+1:


Thanks all! Worked like a charm :+1:


Hey @ScottF , I found a bug with URL to File Path when the file or folder has a plus sign (+) in the name, which is a valid character. The node replaces the + with a space:

Note: In the above image, I am converting from a string, but the behaviour is the same when converting from a Path. For example: List Files/Folders, which returns its result as a Path, then Path to URI, then URL to File Path. List Files/Folders sees the + (that’s where I copied it from, adding my Column1 for the above image), and so does Path to URI:
URI: file:///C:/ROMs/NES/Super%20Mario%20Bros.%20+%20Duck%20Hunt%20+%20World%20Class%20Track%20Meet/Super%20Mario%20Bros.%20+%20Duck%20Hunt%20+%20World%20Class%20Track%20Meet%20(U)%20(PRG0)%20%5B!%5D.nes; EXT: nes

It’s only the URL to File Path that removes the + signs.

I’m on KNIME 4.3.3. Not sure if this also happens in 4.4.
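A plausible explanation, stated here purely as an assumption, is that the URL gets decoded with form-style rules, where `+` stands for a space, instead of plain percent-decoding. The Python sketch below only reproduces that difference on a shortened version of the name above; it says nothing definite about the node's internals.

```python
from urllib.parse import unquote, unquote_plus

# Shortened version of the encoded folder name from the URI above.
encoded_name = "Super%20Mario%20Bros.%20+%20Duck%20Hunt"

# Plain percent-decoding: "%20" becomes a space, "+" is left alone.
path_style = unquote(encoded_name)

# Form-style decoding: "+" is also turned into a space.
form_style = unquote_plus(encoded_name)

# Escaping "+" as "%2B" first lets it survive form-style decoding.
escaped_first = unquote_plus(encoded_name.replace("+", "%2B"))

print(repr(path_style))
print(repr(form_style))
print(repr(escaped_first))
```

Whether pre-escaping the + as %2B before URL to File Path helps inside KNIME is untested here.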


Hi @bruno29a, I can recreate this in KNIME AP 4.4 as well. It turns out that the developers are aware of the way the + character is encoded in paths and they are working on a fix.

Thanks for the feedback!

(Related tickets: AP-16812, AP-17103)


Thanks for looking into this @ScottF