Excel Reader

Hi all,

I know, again with eXcel reader......

i've got some column in excel with this kind of content :

Col x

345

Yes

34539456

Butter

Result in Knime is :

345.0

Yes

3.4539456E6

Butter

To clean up in KNIME is not a good way, so how to make KNIME understand in thie case that will be string ? or to fix manually the type of column ???? (I already try File reader : beurk !)

I also read in this forum : why not read all xls/xlsx column as string (pure text) ????? => VERY GOOD IDEA !!!!

Thank's in advance

MitchP

 

Hi Mitch,

I have no experience with this reader, but I like to save my data as csv. It's extremly portable and efficient. Excel can export and import it, too. And if something goes wrong with it, a line reader can help, or you can even manipulate it with a simple editor if all else fails. The only drawback is that you have to watch out for special characters in your data. So maybe that's an alternative?

Hi, i have the same problem. The only solution is to format the data after the xls reader with a java snippet.

In the next version will be useful a xls reader with the possible to change the format of the columns such as the file reader.

Tank's,

I still have problem while transforming double or integer to string (even with java or Number to String) :

3939455606 => 3.939455606E9

345 => 345.0

Lot of energy for this topic !!!!!!

MitchP

Mitch,

maybe this thread could be of interest for you. Bottom line: currently you might have to resort to snippets. But if you have a lot of energy, you could try to enhance the Number To String Node with formatting options? I think that could be a perfect project for someones first custom node, depending on how complicated the settings should be.

Hi Marlin,Darbon,

@Darbon, could you tell me how your code looks like ?

Very interessant, but finally, I do that this way :

- Problem occurs with XLS Reader node and produce a String column.

- Problem occurs with XLS Writer

- Format result are dependent of :
   - Excel column type
   - Excel version (xls, xlsx)
   - Knime version (2.8.2 vs 2.10.2, 32/64bts)
   - .......

So I build this :

- Missing node : missing -> null (Eliminate Warn in console)

- Java Snippet(simple) with this code :

String strReturn = "";
Double dblReturn = 0.0;
DecimalFormat df = new DecimalFormat("0.###");
String Test = $Col0$;

Test= Test.replace("'","");

if(Test.startsWith("0")){
   strReturn = Test;
}
else{
//   if(Test.matches("\\s*") | Test == null | Test == " " | Test.isEmpty() | Test.trim().isEmpty()){ 
   if(Test.trim().isEmpty()){
      strReturn = null;
   }
   else {
      try{
         dblReturn = Double.parseDouble(Test);
         strReturn = df.format(dblReturn);
      }
      catch(NumberFormatException e){
         strReturn = Test;
      }
   }
}

return strReturn;

And check "Insert missing as null"

And so I become the best result .... for my PoV

Cheers,

Mitch

 

Hi Mitch,

that looks good, but I took the liberty to suggest a few modifications

final String VALUE_ON_ERROR = null; // I'm not sure what this should be. <null>? "0"?
DecimalFormat df = new DecimalFormat("0.###");
String tested = $Col0$; // pet peeve: upper case variable names (unless it's a constant) ;)

// the idea to test for null was good
if (tested == null || tested.trim().isEmpty()) {
  return VALUE_ON_ERROR;
}

// the replacement looked unnecessary, as did the startsWith-test
// if you feel the need to include them, make sure the null-test is the first

try {
  Double dblReturn = Double.parseDouble(tested);
  return df.format(dblReturn);
} catch(NumberFormatException e){
  return VALUE_ON_ERROR;
}