Splitting Columns from CSV file in Knime

Hello all , I am having issue separating the columns , I want to ask how do you split the column header according to its individual columns ?

many thanks :slight_smile:

Hi @ineztan9860 and welcome to the forum.

Is it possible you have a fixed-width format file?

It also might be that it’s a space or tab delimited file, in which case you could uncheck the “ignore spaces and tabs” checkbox and select the appropriate delimiter from the dropdown.

If you could post a sample file here in the forum, assuming it’s not confidential data, maybe someone can help further.

1 Like

Is your column delimiter a space?

this is the file that I was given and I need to clean and transform this data .
when I use 'CSV reader" or “File Reader” , it still gave me the same result

I was suspecting that the delimiter is a space but, somehow it did not give me any changes

Can you upload the CSV file itself, as opposed to a screenshot? Excel does not always correctly represent the structure of the file.

1 Like

inez setB.xlsx (865.0 KB)

hi! hopefully you can see it as I have to change my file type from csv to xlsx in order for me to upload.

my original file type must be csv .

Hi @ineztan9860 , as @ScottF suggested, your file is “tab-delimited”, so if you are loading the csv using the CSV Reader, you need to enter \t as the column delimiter as follows:

image

If you are opening it with File Reader you need to specify as column delimiter like this:

image

2 Likes

Hi, i have followed what you said accordingly.
Once i execute, the file table shown like this.

Hi @ineztan9860

That is odd.
My suggestion of it being tabs was based on what appears in the xlsx file you uploaded, as that definitely contained tabs between columns, so I guess it’s possible that it changed during saving, but that isn’t usual. I am happy to take a look at the actual csv file if you rename it as .txt and then you can upload that.

SetB (1).txt (3.6 MB)

hi @takbb i have converted into .txt , do check it out :slight_smile: thank u

Hi @ineztan9860 , ok I now see your problem… :wink:

What has happened is that every line in your file is wrapped inside double quotes, so consequently it sees the whole record as being a single column.

We can give ways round that, but the usual thing here would be to change how that file is being created, (if possible) as that is actually incorrect for a csv-style format.

If you are unable to change how the file is created, let us know and I or somebody else can give suggestions on how to process it, but better to fix it at source if possible.

I’m hoping that you will find a way to change the source of your problem, but I thought this might be useful as an exercise anyway, especially if it turns out you are unable to change the source…

The attached workflow provides two potential ways of handling this. In both cases they strip the double quotes off the ends of each record. In the first part, it demonstrates reading the file, correcting it, writing it out again whereupon it can be read in by a CSV Reader in its now-corrected format. The thing to note here is that the CSV Writer is told not to write the column headers (because the first row of data already contains them!) so they’ll already be there. This process is repeatable… if on reading in, it is processing a file without double-quotes at the beginning and end of the record, it won’t be a problem, but it should be noted that a file containing well-formed CSV but with double-quotes on the first and/or last columns will get broken by this process! So beware!

image

This in effect is a flow allowing you to “correct” the file, which might be useful as a more generic process should you encounter this issue again.

In the second part, it demonstrates how (instead of overwriting the original file) it can simply do the necessary data correction, and then fix the column names on the table. This is the better and safer option if you don’t need the file fixed!


IReading CSV that is incorrectly formatted - two options.knwf (1.2 MB)

2 Likes

and

I think what happened here (and that is a guess) is that @ineztan9860 might have “converted” the file via Excel, and Excel added these quotes, thinking that these are part of the data.

@ineztan9860 , can you instead rename the file extension from .csv to .txt and upload the file?

2 Likes

hello , thank u for your help ,
I managed to solve the problem .

1 Like

and also, thank u @takbb and @ScottF for your guidance as well

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.