Extract the reference word and the following word in the text

Hi,
I want to extract a reference word and the word that follows it in the text.

For example, if “Dead Date: 25/08/2023” appears in the text, I want to extract the 10 characters that follow the reference word into a separate column, i.e. pull out the date.

Can you add an example workflow?

In short, my goal is to put the deadline in a separate column whenever a deadline is specified in the text.

Hi @umutcankurt, this sounds like something that will have to be tailored specifically to the type of text you have. It would help if you could upload some sample text and a list of “reference words”, along with an indication of the date formats you might expect to see in your text.

To me, this is the kind of task that is quite tricky with bespoke “rules”, because you need to write regex or string manipulation that covers every conceivable free-text input. An AI interface using a large language model, on the other hand, could be expected to give good results with little effort (from us).
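To illustrate why the rule-based route is brittle, here is a minimal Python sketch (not KNIME, and not part of any workflow in this thread). The helper name `extract_after` and the date pattern are assumptions made for illustration: the pattern only covers numeric day/month/year dates separated by "/", "." or "-", so any other format is silently missed.

```python
import re

# Assumed date shape: 1-2 digits, separator, 1-2 digits, separator, 2-4 digits.
# Separators accepted: "/", "." or "-". Anything else will not match.
DATE = r"\d{1,2}[./-]\d{1,2}[./-]\d{2,4}"

def extract_after(text, reference):
    """Hypothetical helper: return the date that follows `reference`, or None."""
    # Reference word, optional spaces and colon, then the date itself.
    pattern = re.escape(reference) + r"\s*:?\s*(" + DATE + r")"
    match = re.search(pattern, text, flags=re.IGNORECASE)
    return match.group(1) if match else None

print(extract_after("Notice. Dead Date: 25/08/2023 at noon", "Dead Date"))
print(extract_after("Dead Date: 25 August 2023", "Dead Date"))
```

The first call finds the date; the second returns `None` because the month is written out, which is exactly the kind of variation that forces more and more rules.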

Hi @takbb,
test1.csv (49.3 KB)
The test data file is attached. One expression I’m looking for is "Time-limit for tender : ". I want to get the date that follows it into a separate column.

Hi @umutcankurt, this may be a starting point for you:

regex extraction for multiple expressions separated by lines.knwf (55.1 KB)

In the Table Creator at the lower left, there are a number of keywords/expressions:

[image: keywords/expressions listed in the Table Creator]

These are keys to look for in the text. I’m assuming that each appears on its own line, as that is how it appears to be presented in the sample file.

The workflow looks for any of these and returns all of the text which follows each of those key expressions. It then pivots the result so that when found, the different values appear in their own column.
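For readers who want to see the same idea outside KNIME, the logic described above can be sketched roughly in Python. This is not the contents of the attached workflow: the function names, the key list and the sample text are invented for illustration, and it assumes (as above) that each key expression sits on its own line.

```python
import re

# Hypothetical key expressions, standing in for the rows of the Table Creator.
KEYS = ["Time-limit for tender", "Dead Date"]

def extract_values(text, keys=KEYS):
    """Return {key: text following the key on its line} for every key found."""
    found = {}
    for key in keys:
        # Key at the start of a line, optional colon, then the rest of that line.
        pattern = r"^[ \t]*" + re.escape(key) + r"[ \t]*:?[ \t]*(.*)$"
        match = re.search(pattern, text, flags=re.MULTILINE | re.IGNORECASE)
        if match:
            found[key] = match.group(1).strip()
    return found

def pivot(documents, keys=KEYS):
    """One row per document, one column per key; None where a key is absent."""
    rows = []
    for doc in documents:
        values = extract_values(doc, keys)
        rows.append({key: values.get(key) for key in keys})
    return rows

# Invented sample text, not taken from the uploaded csv.
docs = [
    "Title: Works\nTime-limit for tender : 25/08/2023 12:00\nOther: x",
    "No deadline here",
]
for row in pivot(docs):
    print(row)
```

Each row keeps everything after the key, so a value such as a date followed by a time would still need trimming or parsing afterwards.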

I don’t know what you then want to do with the data, but you may need to perform further transformations on the results to return the data in a form that you need.

I’ve obfuscated the data in the image, even though it came from the sample csv.

[Edit: I’ve reuploaded but with a reduced and redacted csv sample as the sample data looked like it may be sensitive]


@umutcankurt, if the sample data you uploaded was “real” data rather than purely test data, please delete your upload of the test1.csv file. If you are unable to delete it, please flag the post and request that it be deleted.

@takbb
Thank you very much. This is exactly the solution I was looking for, and more… :trophy: :wink: :+1:

