String manipulation bug?

Dear KNIMErs,

I suspect I found a "string manipulation" bug, unless you can prove me wrong. :) I did the following:

  1. Binned a value into deciles, using the ranges as labels
  2. Split these "range" labels at the comma into an array of 2
  3. Applied toDouble(removeChars($value [Binned]_Arr[0]$, "[()]")) to both array columns

Result: missing values... :-(

However, if I use toDouble(replaceChars($value [Binned]_Arr[0]$, "[()]", "")) instead it works perfectly!

What am I missing? Blanks? Or is it truly a bug?


I am not sure what the expected behavior of the removeChars function is. It does not escape/quote its arguments, so the pattern [()] will be interpreted as [[()]], where the last bracket does not form a proper regular expression. (As I remember, any exception will result in missing values.) You can try the following:

 toDouble(removeChars($value [Binned]_Arr[0]$, "[()\]"))
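
To make the wrapping concrete, here is a minimal Java sketch. It assumes, as described above, that the function simply places its second argument inside square brackets before handing it to the regex engine. The class and method names are hypothetical, and this is not KNIME's actual source.

```java
import java.util.regex.Pattern;

class UnquotedCharsSketch {
    // Hypothetical stand-in: build a character class from the raw argument,
    // without quoting or escaping anything in it.
    static String buildPattern(String chars) {
        return "[" + chars + "]";
    }

    // Remove whatever the unquoted pattern matches from the input.
    static String removeUnquoted(String input, String chars) {
        return Pattern.compile(buildPattern(chars)).matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // The user's own brackets end up as regex syntax inside the wrapper.
        System.out.println(buildPattern("[()]"));
        // Without the extra brackets the wrapper yields an ordinary class
        // that matches '(' or ')'.
        System.out.println(removeUnquoted("(0.25", "()"));
    }
}
```

Under this assumption, brackets typed by the user are no longer plain characters to remove: they become part of the pattern itself.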

Thanks for reporting the problem. As Gabor pointed out, there is an issue with the interpretation of the second argument as a regular expression.

The expected behavior is that the 2nd argument is taken as a literal string and every character in it is removed from the input. We'll fix it for 2.10.
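
As an illustration of that literal interpretation, here is a short Java sketch that treats the second argument as a plain set of characters, with no regular expression involved. The class name, method name, and sample labels are made up for the example. It only mirrors the behavior described above and is not the actual fix.

```java
class LiteralRemoveCharsSketch {
    // Remove every character of `chars` from `input`, treating `chars`
    // as a literal set of characters rather than a pattern.
    static String removeCharsLiteral(String input, String chars) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (chars.indexOf(c) < 0) {
                out.append(c); // keep characters that are not in the removal set
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical halves of a bin label after splitting at the comma.
        String lower = removeCharsLiteral("[0.1", "[()]");
        String upper = removeCharsLiteral("0.2)", "[()]");
        System.out.println(Double.parseDouble(lower) + " " + Double.parseDouble(upper));
    }
}
```

With this reading, "[()]" means "remove any of the four characters [, (, ) and ]", so both halves of a label convert cleanly to numbers.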