I have emails in a column and I want to split each cell (one email) into a body column and a metadata column. So far the Cell Splitter node has done nothing: whether I try to split at empty lines or at certain characters, the output is always an unsplit copy of the original text. Is there another way to split a string or, better, a document? Here’s an example of an email (it’s in the public domain, from the ‘ENRON Corpus’) that I would like to split after ‘X-FileName:…’:
Message-ID: <26127350.1075840042651.JavaMail.evans@thyme> Date: Wed, 23 Jan 2002 17:02:42 -0800 (PST) From: firstname.lastname@example.org Subject: Copier Commitment Information Requested Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-From: Enron Corporate Administrative Services@ENRON X-To: All Enron Employees North America@ENRON X-cc: X-bcc: X-Folder: \ExMerge - Slinger, Ryan\Deleted Items X-Origin: SLINGER-R X-FileName: Houston Offices Bankrupt and Non-Bankrupt Business Units: If you own, lease, rent, or receive invoices for copiers or have questions regarding copiers Please call Harry Grubbs at 713-853-5417, or email@example.com. All Bankrupt entities outside Houston, or other Offices outside Houston that are closing: If you own, lease, rent, or receive invoices for copiers or have questions regarding copiers Please contact Paula Corey at 713-853-9948, or firstname.lastname@example.org. Non-Bankrupt Entities Outside Houston: Please call Harry Grubbs at 713-853-5417, or email@example.com.
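To make the intended result concrete, here is a minimal Python sketch of the split I have in mind, outside of KNIME. It assumes the metadata always ends with the ‘X-FileName:’ field and that everything after it is the body. The helper name `split_email` and the shortened sample string are only illustrative, and real emails with a value after ‘X-FileName:’ would need a different rule.

```python
import re
from typing import Tuple

# Assumption: the header block ends with the "X-FileName:" field.
# Matching is case-insensitive so "X-filename:" is found as well.
MARKER = re.compile(r"X-FileName:", re.IGNORECASE)


def split_email(raw: str) -> Tuple[str, str]:
    """Hypothetical helper: return (metadata, body) for one email string."""
    match = MARKER.search(raw)
    if match is None:
        # No marker found: keep the whole text as the body.
        return "", raw.strip()
    metadata = raw[:match.end()].strip()  # everything up to and including the marker
    body = raw[match.end():].strip()      # everything after the marker
    return metadata, body


# Shortened version of the example email above (the "..." is my abbreviation).
sample = (
    "Message-ID: <26127350.1075840042651.JavaMail.evans@thyme> "
    "X-Origin: SLINGER-R X-FileName: Houston Offices Bankrupt and "
    "Non-Bankrupt Business Units: If you own, lease, rent ..."
)

metadata, body = split_email(sample)
print("METADATA:", metadata)
print("BODY:", body)
```

The sketch searches for the marker with a regular expression instead of splitting on a single delimiter character, because the marker is a multi-character string and the header text before it contains many colons and spaces of its own.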