Sentence extraction/generation from unstructured data

Hi,

I currently have a txt file containing unstructured text without any punctuation. Is there a node available that can parse this text and split it into separate sentences, e.g. using NLP? For example:

Input:

the world is big my name is Alice

Output:

the world is big

my name is Alice

Hi Myla,

Sounds doable, but not with any single node I know of. Part-of-speech (POS) tagging should give you hints about where one sentence ends and the next begins, though obviously it won't be possible to reproduce nested structures this way.
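
To illustrate the idea outside of KNIME, here is a rough Python sketch of a POS-based split. Everything in it is an assumption made for the example: the TOY_TAGS lexicon is a hand-written stand-in for a real tagger, and the rule (start a new sentence when a determiner or pronoun follows a noun or adjective in a clause that already has a verb) is just one simple heuristic, not an established algorithm.

```python
# Toy POS lexicon: a hypothetical stand-in for the output of a real tagger.
TOY_TAGS = {
    "the": "DET", "a": "DET", "my": "DET", "your": "DET",
    "i": "PRON", "you": "PRON", "it": "PRON",
    "is": "VERB", "am": "VERB", "are": "VERB", "likes": "VERB",
    "big": "ADJ", "small": "ADJ",
    "world": "NOUN", "name": "NOUN", "alice": "NOUN",
}


def tag(tokens):
    """Look up each token in the toy lexicon; unknown words default to NOUN."""
    return [(t, TOY_TAGS.get(t.lower(), "NOUN")) for t in tokens]


def split_sentences(text):
    """Split unpunctuated text at points where a new clause seems to start."""
    sentences, current = [], []
    seen_verb, prev_tag = False, None
    for token, pos in tag(text.split()):
        # A determiner or pronoun can open a new noun phrase.
        starts_phrase = pos in ("DET", "PRON")
        # The running clause looks finished: it has a verb and ends on a noun/adjective.
        clause_complete = seen_verb and prev_tag in ("NOUN", "ADJ")
        if current and starts_phrase and clause_complete:
            sentences.append(" ".join(current))
            current, seen_verb = [], False
        current.append(token)
        seen_verb = seen_verb or pos == "VERB"
        prev_tag = pos
    if current:
        sentences.append(" ".join(current))
    return sentences


print(split_sentences("the world is big my name is Alice"))
```

This handles the example from the question, but the heuristic is easy to break: a sentence with two objects in a row (say "I gave Alice the book") would be cut in the wrong place, and embedded clauses are out of reach entirely.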

Regards
E

Moving this to the text processing forum.

Hi Myla,

You can try the Strings to Document node on these strings, followed by the Sentence Extractor, but I doubt this will give reasonable results: sentence tokenization relies mostly on punctuation marks.
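
To show why punctuation matters, here is a minimal Python sketch of a punctuation-based splitter. It is only an illustration of the general approach, not the Sentence Extractor's actual implementation; the function name split_on_punctuation and the regular expression are assumptions chosen for the example.

```python
import re


def split_on_punctuation(text):
    """Split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


# With punctuation, the boundaries are found.
print(split_on_punctuation("The world is big. My name is Alice."))

# Without punctuation, the whole input comes back as a single "sentence".
print(split_on_punctuation("the world is big my name is Alice"))
```

With no punctuation marks in the input there is nothing for such a splitter to anchor on, so the text comes back in one piece.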

Cheers, Kilian