How to extract text between two dates.

Can anyone help me extract the text between two dates in KNIME? The dates come in different formats, such as 10.04.2020, 04/09/2020, 01-29-2020, [TS:20200410054609] and April 10, 2020. I am attaching a file. I need Column C extracted from Column B. Can it be done with regex? If so, could you please help me, as I am new to KNIME? I will be doing text classification.
Jhuma
Text to be extracted.xlsx (10.1 KB)

Hi Jhuma,
That should be possible with a Regex Split node.

Hope that helps.


Hi Daniel,
Thanks for your help. The node works fine for the file I shared with you, but it's not working for my actual data. It shows "371 input string(s) did not match the pattern or contained more groups than expected". I tried another file, and it worked only for the 3rd record and not the rest. Is it because the records have many dates in them? How can this be solved? Can you please help me? I am attaching another file with 4 records. Also, can I include the timestamp along with the date in the pattern?
Text to be extracted1.xlsx (8.9 KB)

The text I need to extract between two dates may span several lines. The text is actually a ticket log, and there are different lines in it. I am adding the last row for your reference; this is the actual text. Please help.
Text to be extracted1.xlsx (9.4 KB)

I think the regex is not handling line breaks correctly if the issue only occurs on these texts
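
A minimal sketch of the line-break problem, written in Python rather than as a KNIME workflow. The sample log text and the variable names are made up for illustration, and only the dd.mm.yyyy format is covered. By default "." does not match a line break, so a pattern with a group between two dates fails when the text spans several lines; enabling dot-all mode fixes that. KNIME's regex nodes are assumed here to use Java regular expressions, where the same mode can be switched on with the inline flag (?s) at the start of the pattern.

```python
import re

# Hypothetical ticket log: the text of interest spans two lines.
log = "10.04.2020 user reported an error\nrestart did not help\n11.04.2020 ticket closed"

# dd.mm.yyyy with the dots escaped so they match literal dots only.
DATE = r"\d{2}\.\d{2}\.\d{4}"
PATTERN = DATE + r"(.*?)" + DATE

# Without dot-all mode "." stops at the line break, so nothing matches.
no_flag = re.search(PATTERN, log)

# With dot-all mode "." also matches "\n" and the lazy group spans both lines.
with_flag = re.search(PATTERN, log, flags=re.DOTALL)

between = with_flag.group(1).strip()
print(repr(between))
```

The lazy quantifier (.*?) stops at the first following date, which matters when a record contains many dates.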

Okay… So how can this be handled? Any ideas?



You can try four or five “String Manipulation” nodes, one for each date format, replacing the date with " | " for example, and then a “Cell Splitter” node with “|” as the delimiter.
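
The replace-then-split idea can be sketched outside KNIME as follows. This is a Python illustration, not the node configuration itself: the pattern list is an assumption based on the five date formats named in the question, and split_on_dates and the sample text are hypothetical.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# One pattern per date shape from the question (assumed shapes).
DATE_PATTERNS = [
    r"\[TS:\d{14}\]",                        # [TS:20200410054609]
    r"\d{2}[./-]\d{2}[./-]\d{4}",            # 10.04.2020, 04/09/2020, 01-29-2020
    r"(?:" + MONTHS + r") \d{1,2}, \d{4}",   # April 10, 2020
]

def split_on_dates(text, delimiter="|"):
    """Replace every date with the delimiter, then split on it."""
    for pattern in DATE_PATTERNS:
        text = re.sub(pattern, delimiter, text)
    # Drop the empty pieces left where a date started or ended the text.
    return [part.strip() for part in text.split(delimiter) if part.strip()]

sample = ("10.04.2020 first note\nsecond line 04/09/2020 follow-up "
          "April 10, 2020 closed [TS:20200410054609] done")
parts = split_on_dates(sample)
print(parts)
```

Because re.sub replaces the dates without relying on "." to cross lines, multi-line text between two dates stays together in one piece. The approach breaks if the log text itself contains the delimiter character, so a character that never occurs in the data is the safer choice.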

Okay… Can you please help me, as I am new to KNIME?

I am not able to use the String Manipulation node. I used the regex (r'([0-9]{2}.[0-9]{2}.[0-9]{4})') in the String Manipulation node, but every time I get the warning "ambiguous return type! Use 'string()' or another function from the "Convert Type" category to specify the return type". Can anyone help with this?
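
Separately from the return-type warning, two details of that pattern are worth noting. The r'…' wrapper is Python raw-string notation, not part of the regex itself, and the dots are unescaped, so each one matches any character rather than a literal dot. A short Python sketch of the second point, using a made-up sample string:

```python
import re

# The pattern from the post, and a variant with the dots escaped.
unescaped = r"[0-9]{2}.[0-9]{2}.[0-9]{4}"
escaped = r"[0-9]{2}\.[0-9]{2}\.[0-9]{4}"

# Hypothetical text containing a long number and one real date.
sample = "ref 12345678901 opened 10.04.2020"

loose = re.findall(unescaped, sample)   # also picks up part of the long number
strict = re.findall(escaped, sample)    # only the dd.mm.yyyy date
print(loose, strict)
```

With the unescaped version, any run of ten digits can be mistaken for a date, which matters in ticket logs that contain IDs or phone numbers.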

Hi Daniel and Andrejz,
I have solved the issue with String Replacer and Regex Extractor.

Thanks, both.

