XLS Border Formatter

I notice in the XLS Control Table Generator there is a tab for File > Load Settings or Save Settings.
Am I supposed to load something there. I keep getting the error the generated table is not fully valid XLS Formatter Control Table as it contains invalid characters in tag.
I Looked at a formatter that works online and when I copy it into my workflow it fails.
I get the same error message with XLS Border Formatter.
Anybody have a working workflow with XLS Border Formatter and XLS Cell Formatter
Thanks
Scott

Check Applies for all tags.

Hi,
I have the applies to all tags checked. I even tried to uncheck it and it didn’t make a difference.KNIME_project_sample.knwf (12.3 KB)

Hi,
What I want to do is to open up my spreadsheet and have the columns left justified, have the word wrap (word Text in excel) applied so I can look through the spreadsheet and see what is missing for blank cells. Then I can go and add string manipulation to fix the empty cells. From there I will convert it to a CSV file and I am assuming then I would need to get rid of any extra spaces /n errors etc in the CSV file.
Thanks
Scott

Hi -
I might be totally misunderstanding what’s going on here - is this correct:

  • You ultimately want to produce a CSV file (to pass on to someone else, or for whatever reason)
  • As an intermediate step, you want to produce an Excel file to open in Excel
  • You want this Excel file to have all cells formatted so that text is left justified and each cell’s text is wrapped
  • You want to do this so you can visually check to see which cells are blank
  • You want to then use this information about which cells are blank to use a node back in KNIME to fix the empty cells (where “fix” means what?)

Is this all correct?

KNIME_Combine Excel.knwf (55.9 KB)

  • Yes I want a CSV file to import into a python script. but first
  • You want this Excel file to have all cells formatted so that text is left justified and each cell’s text is wrapped
    You want to do this so you can visually check to see which cells are blank
    You want to then use this information about which cells are blank to use a node back in KNIME to fix the empty cells (where “fix” means what?) I did this already but there is still cell’s that are blank so I need to fix them so the blank cells will say Unknown or none
    Once the CSV file is created I need to make sure there were no special characters added to the file or /n etc to make sure the data is clean.

This is what I have so far. Just trying to fix the formatting currently. It has been a challenging process but I enjoy learning new tools.
Thanks
Scott

  • List item

Hmmm… but this is where I’m confused: if you want to produce a CSV, why do you care about formatting in the intermediary (since a CSV has no formatting information attached to it.) ?

Good Question
Maybe I am thinking about this all wrong since I will still need to format and clean the data once the csv is created.
What would you suggest I do.?

When you say you need to format it once the CSV is created - why / in what manner? Some Python script, as i understand from above, will be reading the CSV file - what kind of formatting is it expecting?

There is no special formatting needed with the python program. Only said that so I can review the data in the csv file. My main concern is making sure the data is clean.
Here is the history.
Downloaded excel file from the web
Removed the macros
Combine the excel files using knime
Added an additional column in knime.
Filled in some blank cells with data
Still have some blank cells
This is where I am now.

If you have a rule for filling in the blank cells (e.g “all blank cells get ‘DATA MISSING’ inserted”) then it seems like you could use an existing KNIME node like the String Manipulation node to say ‘if the trimmed [all leading and trailing whitespace removed] cell content has a zero length, then insert the value of XXX into the cell’.

Thanks for the information. I appreciate all the feedback. I ended my Current workflow with the CSV Writer and it looks similar to the Excel File. Which is what I expected. Now when I try to load the file into the File Reader I can’t get it to show the columns like the workflow example I found. Plus If I try the CSV File Reader I get an error message New line in quoted string (or closing quote missing) in line 29 but when I look at the file table there is no data.
Thanks
Scott

Just to be a repetitive pedant, that your CSV file “looks similar to the Excel File” is total random luck since the CSV file contains no styling information and no data type information, so whatever application is displaying the CSV file content for you just happens to be guessing the formatting and types that your Excel file actually has in it.

Now with that PSA out of the way, if you can’t attach a portion of your CSV file (like the first 10 lines) to this thread, could you include a screenshot of the configuration dialog for your File Reader and a screen shot of some of the output table of the File Reader node, and a more descriptive description of the difference between what you’re expecting and what you’re seeing?

Hi,
Here is what the file looks like from the CVS Writer.

Master_id Names Address_1 Address_2 Town State Interests_at_this_location Primary_id_text Primary_id_value Secondary_id_text Secondary_id_value
10001832 ----- 3 Center Rd NO LOT NO Amherst NH Water Well Record Well Id (WRB#): 7.1249 Map: - Lot: -
10143622 14-lot Subdivision Plan Tax Map: 004 Lot No: 142 Amherst NH Alteration Of Terrain Permit File No: 20050606-02 NO INFO NO INFO
14803 27 Lyndeboro Rd. Residence 27 Lyndeboro Rd NO LOT NO Amherst NH On-premise Use Facility Containing Fuel Oil Site No: 199411004 Project No: 5325
10137969 305.7 Acre Subdivision ADDRESS UNKNOWN NO LOT NO Amherst NH Alteration Of Terrain Permit File No: 19870508 B Permit No: WPS-2166
10125091 4nh Homes Llc Old Manchester Rd. NO LOT NO Amherst NH Water Well Record Well Id (WRB#): 7.1309 Map: 8 Lot: 85-1
10138958 9 Columbia Drive ADDRESS UNKNOWN NO LOT NO Amherst NH Alteration Of Terrain Permit File No: 19881004 A Permit No: WPP-3387A
10007868 9 Wilkins Rd Wilkins Rd NO LOT NO Amherst NH Water Well Record Well Id (WRB#): 7.0921 Map: 5 Lot: 118-0
61 ??? Overlook Dr Rtes 122101a NO LOT NO Amherst NH Underground Storage Tank Program Facility Id: 112990 Site No: 200106045
10004895 A. Arsenault 9 Warren Way NO LOT NO Amherst NH Water Well Record Well Id (WRB#): 7.0187 Map: 2 Lot: 146-42
10003858 A. Barraford 13 Juniper Dr NO LOT NO Amherst NH Water Well Record Well Id (WRB#): 7.1217 Map: 40 Lot: 3
10107440 A. Bourzikas 41 Embankment Rd NO LOT NO Amherst NH Water Well Record Well Id (WRB#): 7.0415 Map: 8 Lot: 116
10007352 A. Brown 36 Standish Way NO LOT NO Amherst NH Water Well Record Well Id (WRB#): 7.117 Map: 1 Lot: 35-24
10007351 A. Brown 36 Standish Way NO LOT NO Amherst NH Water Well Record Well Id (WRB#): 7.1169 Map: 1 Lot: 35-24
10005218 A. Carraccio 44 Green Rd NO LOT NO Amherst NH Water Well Record Well Id (WRB#): 7.0408 Map: 7 Lot: 94-3
10004318 A. Chawla 46 Lyndeborough Rd NO LOT NO Amherst NH Water Well Record Well Id (WRB#): 7.1186 Map: 5 Lot: 64-1

File Reader Viewer

File Reader View from Sample example

When you look at the File Reader that my output put together and the File Reader Sample example.
I would like mine to look like the File Reader example with the Columns. I would like it to look like this so there is no special characters @#%/n etc in my csv files

In your file reader configuration screenshot, your “Column delimiter” is listed as “none” - try making this a comma.

Thanks works with my Sample File but not my large combined file with my 5 Combined EXCEL files.
So for now I am trying to figure out the errors.

From that screen shot, it looks like your CSV’s first row is the names of columns, so you probably want to tick the ‘read column headers’ checkbox seen in the screenshot

If your CSV has a heterogeneous per-row column cardinality, click on the “Advanced…” button seen in the screenshot, then the “Short Lines” tab, then the check box for “allow short lines”

I am now missing Data not sure why my Data would get out of sync. Plus I want to get rid of the quotes. I removed the quote support but the quotes are still there.


Thanks for all your support

If you want to strip out quotes, this has been discussed previously in the forum - like: Remove double quotes beginning and ending of a string

This may be a dumb question
I used Csv writer to create the csv file
I just imported it into file reader and created a csv table in the reader but it messed up my table and it is out of sync. Meaning data isn’t lined up in the right columns in certain rows and columns. Not sure if removing the quotes will fix it. But it is worth a try. So if I connect the string manipulation to remove the quotes this may fix my issues.
Is there any other nodes you would recommend before adding the csv file into my python script in order to clean the csv file.