Hi,
I want to combine the compound data from the AccurateMasssearch node and the metabolitespectralmatch node in OpenMS. The Joiner result is empty, and the Joiner node does not seem suitable, so I am looking for a new and effective method to achieve this.
To me it looks like the Joiner node produces an empty output because you are matching on double columns, which are (close to) continuous variables. The Joiner only matches rows if the values are exactly the same, which is hardly ever the case judging by the table you shared.
To discretize the values, you could use the Auto-Binner node on the data from your AccurateMasssearch node and then apply the same binning to the data from your metabolitespectralmatch node with the Auto-Binner (Apply) node.
By adjusting the number of bins you can control how accurate the match is: more bins mean narrower bins and a stricter match. You’d need to do this for both the “exp_mass_to_charge” column and the “retention_time” column.
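To illustrate the idea outside of KNIME, here is a minimal Python sketch of "compute the bins on one table, then apply the same bins to the other". It assumes equal-width bins; the function names `make_edges` and `apply_bins` and the sample values are made up for illustration and are not part of KNIME or OpenMS.

```python
from bisect import bisect_right

def make_edges(values, n_bins):
    """Equal-width inner bin edges spanning min..max of the reference values."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]

def apply_bins(values, edges):
    """Assign each value the index of its bin, reusing edges computed elsewhere."""
    return [bisect_right(edges, v) for v in values]

# Hypothetical m/z values, not taken from the tables in this thread
mz_search = [100.0, 150.0, 200.0, 300.0]   # reference table: defines the bins
mz_match = [100.02, 199.0, 299.9]          # second table: reuses the same bins

edges = make_edges(mz_search, n_bins=4)
bins_search = apply_bins(mz_search, edges)
bins_match = apply_bins(mz_match, edges)

print(edges)
print(bins_search)
print(bins_match)
```

Note that 199.0 and 200.0 end up in different bins although they are close, because a bin edge lies between them. This boundary effect is a general limitation of binning, so it is worth checking a few bin counts.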
Another option that might already help is to round the data with the Math Formula node.
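As a rough sketch of what rounding does to the values, the following Python snippet adds a rounded copy of a column to each row. The helper `round_column`, the number of decimals, and the sample rows are assumptions chosen for illustration only.

```python
def round_column(rows, column, decimals):
    """Return copies of the rows with an extra '<column>_rounded' field."""
    return [
        {**row, column + "_rounded": round(row[column], decimals)}
        for row in rows
    ]

# Hypothetical rows, not taken from the tables in this thread
table_a = [{"exp_mass_to_charge": 180.0634, "retention_time": 312.42}]
table_b = [{"exp_mass_to_charge": 180.0629, "retention_time": 312.38}]

# Round m/z to 2 decimals and retention time to whole seconds
for column, decimals in (("exp_mass_to_charge", 2), ("retention_time", 0)):
    table_a = round_column(table_a, column, decimals)
    table_b = round_column(table_b, column, decimals)

print(table_a[0])
print(table_b[0])
```

After rounding, both rows carry identical values in the rounded columns, so an exact-match join on those columns can pair them.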
After discretization, you are ready to use the Joiner node on both binned columns. Be aware that if there are several rows in each bin for both tables, the resulting table might quickly blow up. How to handle this depends on what you want to achieve, but the “Duplicate Row Filter” node might be an option.
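The blow-up can be seen in a small Python sketch of an inner join on two binned columns, followed by a simple "keep the first row per key" filter. The column names `mz_bin` and `rt_bin`, the ID fields, and both helper functions are hypothetical; this imitates the behaviour conceptually and is not the KNIME implementation.

```python
from collections import defaultdict

def join_on_keys(left, right, keys):
    """Inner join: pair every left row with every right row sharing the same key values."""
    index = defaultdict(list)
    for r in right:
        index[tuple(r[k] for k in keys)].append(r)
    joined = []
    for l in left:
        for r in index.get(tuple(l[k] for k in keys), []):
            joined.append({**l, **r})
    return joined

def keep_first_per_key(rows, keys):
    """Keep only the first row seen for each combination of key values."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[k] for k in keys)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept

# Hypothetical binned rows: two rows per table fall into the same (mz_bin, rt_bin)
left = [
    {"search_id": "A1", "mz_bin": 3, "rt_bin": 7},
    {"search_id": "A2", "mz_bin": 3, "rt_bin": 7},
    {"search_id": "A3", "mz_bin": 4, "rt_bin": 7},
]
right = [
    {"match_id": "B1", "mz_bin": 3, "rt_bin": 7},
    {"match_id": "B2", "mz_bin": 3, "rt_bin": 7},
    {"match_id": "B3", "mz_bin": 5, "rt_bin": 1},
]

keys = ["mz_bin", "rt_bin"]
joined = join_on_keys(left, right, keys)
deduped = keep_first_per_key(joined, keys)

print(len(joined), len(deduped))
```

Two rows on each side in the same bin already give 2 x 2 = 4 joined rows. Keeping only the first row per bin is just one possible policy; whether it is appropriate depends on which candidate you actually want to retain.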
I tried it again, but the Joiner output is still empty. My description may have been unclear: I want to combine the two tables by matching retention_time to retention_time and mz to mz. One table has about 19000 rows, while the other has 1200 rows.
The retention_time and mz values in the two tables are close, but not identical.
When I join on exp_mass_to_charge[Binned], the Joiner works. For the final result, I think I need to determine the right accuracy. Do I have to inspect this manually, or is there a simpler solution?
You are correct, finding the right accuracy might be difficult, since it depends on your data and the outcome you want. Maybe it makes sense to plot “exp_mass_to_charge” as a function of “retention_time” to see where joining would make sense?
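Besides plotting, one rough way to get a feeling for the accuracy is to count how many values still find a partner as the precision gets stricter. The sketch below does this with rounding on a single column; the helper `count_matches` and the sample values are invented for illustration, and the same idea would apply to different bin counts.

```python
def count_matches(left, right, decimals):
    """Count left values that equal at least one right value after rounding."""
    right_keys = {round(v, decimals) for v in right}
    return sum(1 for v in left if round(v, decimals) in right_keys)

# Hypothetical m/z values, not taken from the tables in this thread
mz_a = [180.0634, 255.2301, 301.1412]
mz_b = [180.0629, 255.2390, 410.5200]

# Sweep from coarse (0 decimals) to strict (4 decimals)
results = {d: count_matches(mz_a, mz_b, d) for d in range(5)}

for decimals, matches in results.items():
    print(decimals, matches)
```

A precision where the match count stops changing much could be a reasonable starting point, but a manual look at a few matched rows is still advisable.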