How do I combine exp_mass_to_charge and Split Value 1 with retention_time?

Hi,
I want to combine the compound data from the AccurateMasssearch node and the metabolitespectralmatch node in OpenMS. The Joiner result is empty, and the Joiner node does not seem suitable for this, so I am looking for a different, effective way to do it.




Thank you very much!

Hi @zero,

to me it looks like the joiner node produces an empty output because you are matching on double columns, which are (close to) continuous variables - the joiner only matches if the values are exactly the same, which is rarely the case judging by the table you shared.
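To illustrate the point outside of KNIME, here is a minimal plain-Python sketch of an equality join on two double columns. The numbers are made up and the `exact_join` helper is hypothetical; only the column names come from this thread.

```python
# Made-up values: the two tables describe the same two compounds,
# but the measured doubles differ in the last digits.
table_a = [
    {"exp_mass_to_charge": 180.06339, "retention_time": 312.41},
    {"exp_mass_to_charge": 255.23295, "retention_time": 540.02},
]
table_b = [
    {"exp_mass_to_charge": 180.06341, "retention_time": 312.38},
    {"exp_mass_to_charge": 255.23301, "retention_time": 540.10},
]


def exact_join(left, right, keys):
    """Inner join that only pairs rows whose key values are identical."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    return [
        (l_row, r_row)
        for l_row in left
        for r_row in index.get(tuple(l_row[k] for k in keys), [])
    ]


matches = exact_join(table_a, table_b, ["exp_mass_to_charge", "retention_time"])
print(len(matches))  # 0: no pair of rows has exactly equal values
```

The values are close, but none of them are equal, so the result is empty.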

One way to discretize the values is to use the Auto-Binner node on the data of your AccurateMasssearch node and then apply the same binning to the data of your metabolitespectralmatch node with the Auto-Binner (Apply) node.


By adjusting the number of bins you can define the accuracy: more bins mean narrower bins and therefore a stricter match. You’d need to do this for both the “exp_mass_to_charge” column and the “retention_time” column.
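The idea of learning the bins on one table and applying the same bins to the other can be sketched in plain Python as follows. This is only an illustration of equal-width binning, not the KNIME implementation: the helper names and the numbers are made up, and clamping out-of-range values to the outer bins is an assumption of this sketch.

```python
def fit_equal_width_bins(values, n_bins):
    """Learn equal-width bin settings (lower bound, bin width) from one column."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return lo, width, n_bins


def apply_bins(value, bins):
    """Assign a value to a bin index using previously learned settings.

    Values outside the learned range are clamped to the first or last bin
    (an assumption made for this sketch).
    """
    lo, width, n_bins = bins
    if width == 0:
        return 0
    idx = int((value - lo) // width)
    return max(0, min(n_bins - 1, idx))


# Made-up m/z values for the two tables.
mz_a = [100.0, 180.06339, 255.23295, 500.0]
mz_b = [180.06341, 255.23301]

# Learn the bins on table A only: bin width = (500 - 100) / 4000 = 0.1
bins = fit_equal_width_bins(mz_a, n_bins=4000)

# Apply the *same* bins to both tables so the bin indices are comparable.
binned_a = [apply_bins(v, bins) for v in mz_a]
binned_b = [apply_bins(v, bins) for v in mz_b]

# Bin indices are integers, so an equality join on them can succeed.
shared = sorted(set(binned_a) & set(binned_b))
print(shared)
```

One caveat of any binning: two values that are very close can still fall on opposite sides of a bin border and end up in neighbouring bins, so some true matches may be missed.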

Another option that might already be enough is to round the data with the Math Formula node.
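As a rough sketch of the rounding idea in plain Python (not the Math Formula syntax): the `rounded_key` helper, the numbers and the chosen number of decimals are all assumptions for illustration.

```python
def rounded_key(row, mz_decimals=2, rt_decimals=0):
    """Build a join key from rounded m/z and retention time values."""
    return (
        round(row["exp_mass_to_charge"], mz_decimals),
        round(row["retention_time"], rt_decimals),
    )


# Made-up rows: a and b are the same compound, c is a different one.
a = {"exp_mass_to_charge": 180.06339, "retention_time": 312.41}
b = {"exp_mass_to_charge": 180.06341, "retention_time": 312.38}
c = {"exp_mass_to_charge": 255.23295, "retention_time": 540.02}

print(rounded_key(a) == rounded_key(b))  # True: both become (180.06, 312.0)
print(rounded_key(a) == rounded_key(c))  # False
```

The number of decimals plays the same role as the number of bins: fewer decimals give a looser match.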

After discretization, you are ready to use the joiner node on both binned columns. Be aware that if there are several rows in each bin for both tables, every row in a bin is paired with every row in the same bin of the other table, so the resulting table might quickly blow up - how to handle this depends on what you want to achieve, but the “Duplicate Row Filter” might be an option.
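A small plain-Python sketch of that blow-up, with made-up row IDs and a hypothetical `mz_bin` column. Keeping only the first match per row is just one possible policy, chosen here for illustration; it is not necessarily what you want for your data.

```python
from collections import defaultdict

# Made-up rows: three rows of table A and two rows of table B share one bin.
left = [
    {"id": "A1", "mz_bin": 800},
    {"id": "A2", "mz_bin": 800},
    {"id": "A3", "mz_bin": 800},
]
right = [
    {"id": "B1", "mz_bin": 800},
    {"id": "B2", "mz_bin": 800},
]

# Index table B by bin, then pair every A row with every B row in the same bin.
by_bin = defaultdict(list)
for row in right:
    by_bin[row["mz_bin"]].append(row)

joined = [
    (l_row["id"], r_row["id"])
    for l_row in left
    for r_row in by_bin.get(l_row["mz_bin"], [])
]
print(len(joined))  # 6 rows: 3 x 2 combinations from a single bin

# One possible clean-up: keep only the first match for each A row.
first_match = {}
for left_id, right_id in joined:
    first_match.setdefault(left_id, right_id)
print(first_match)  # {'A1': 'B1', 'A2': 'B1', 'A3': 'B1'}
```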

Hope that helps,
Lukas


Thank you for your reply. I will try it as you suggested.


I tried it again, but the Joiner output is still empty. My question may have been unclear: I want to combine the two tables by matching retention_time to retention_time and mz to mz. One table has about 19000 rows, the other about 1200 rows.
The retention_time and mz values in the two tables are close to each other, but not identical.

@LukasS


When I join on exp_mass_to_charge[Binned], the Joiner works. For the final result, I think I need to determine the right accuracy. Do I have to inspect it manually, or is there a simpler solution?

Hi @zero,

good to hear that the joiner works now :slight_smile:

If I understood your initial question correctly, I think you want to do the binning for the retention_time as well and join based on this column too.

You are correct, finding the right accuracy might be difficult - that depends on your data and the outcome you want. Maybe it makes sense to plot the “exp_mass_to_charge” as a function of “retention_time” to see where a joining would make sense?
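As a complement to plotting (this is a different technique, not the plot itself), one could also try several accuracies and count how many rows find no partner or more than one partner at each setting. The sketch below does this in plain Python for one column using rounding; the `match_counts` helper and all values are made up for illustration.

```python
from collections import Counter


def match_counts(left, right, decimals):
    """For each left value, count right values that share its rounded key."""
    right_keys = Counter(round(v, decimals) for v in right)
    return [right_keys[round(v, decimals)] for v in left]


# Made-up m/z values for the two tables.
left_mz = [180.06339, 255.23295, 301.14100]
right_mz = [180.06341, 180.07102, 255.23301, 410.00000]

for decimals in (1, 2, 3, 5):
    counts = match_counts(left_mz, right_mz, decimals)
    unmatched = sum(c == 0 for c in counts)
    ambiguous = sum(c > 1 for c in counts)
    print(f"decimals={decimals} unmatched={unmatched} ambiguous={ambiguous}")
```

With these toy numbers, one decimal gives an ambiguous match, five decimals lose every match, and the settings in between pair each matching compound exactly once. On real data, a sweep like this can narrow down the range worth inspecting manually.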

Best, Lukas

First of all, thank you very much.
This may work for some data, but for mass spectral data, exp_mass_to_charge is not a function of retention_time, so that relationship does not apply.
I will consider your advice.

