Reducing metabolite candidates from AccurateMassSearch node

Hi everyone!
I am not sure if I should post this here because it's not a question about an error.

I was wondering how to reduce the number of metabolite candidates (or get rid of most of them) when identification is performed with the AccurateMassSearch node. In our case, each run of our KNIME workflow processes the technical and biological replicates of a single biological experiment. Since the experiments are processed independently of each other, I am not sure a t-test or similar could do the job. Is there a score value I could have missed?

The point is that we get thousands of metabolites for every experiment, and many of the candidates are basically noise.


I think after feature finding there should be a “quality” column for each feature that describes how good the elution profiles look and how well the isotopic traces align.

After AccurateMassSearch there might be an additional isotope_score that tells you how well the isotopic distribution matches the expected one based on the elemental composition. We need to double-check this.

I agree, a signal-to-noise measurement for every feature would be nice. I don’t think it is there by default. You can always use pyopenms in a Python script node to iterate over the features, extract their region from the original experiment, and calculate some statistic yourself.
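A minimal sketch of such a statistic, assuming the peak intensities inside and around a feature's RT/m/z window have already been extracted (the pyopenms part is left out here). The definition of signal-to-noise as apex over median background, the function name, and the example numbers are my own assumptions, not something OpenMS provides:

```python
from statistics import median


def signal_to_noise(feature_intensities, background_intensities):
    """Apex intensity of the feature divided by the median non-zero background intensity."""
    if not feature_intensities:
        raise ValueError("feature region is empty")
    # Ignore zero-intensity points so an empty background does not dominate the median.
    positive_background = [i for i in background_intensities if i > 0]
    if not positive_background:
        return float("inf")
    return max(feature_intensities) / median(positive_background)


# Hypothetical intensities: peaks inside the feature's RT/m/z window,
# and peaks from a surrounding window of the same run.
feature_region = [120.0, 850.0, 2300.0, 900.0, 150.0]
background_region = [40.0, 55.0, 0.0, 60.0, 45.0]
snr = signal_to_noise(feature_region, background_region)
```

How wide the surrounding window should be is a judgment call and will depend on how crowded your data are.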

In general, though, estimating the confidence of IDs in metabolomics is rather hard. We are planning to do some false discovery estimation of features by generating shifted false/decoy features and using their score distribution as a baseline. Not sure when that will come, though.

Thank you so much for your answer! Really helpful!

  1. For FeatureFinderMetabo, yes, I found the quality column. Most values are close to 0, ranging from 5.734668E-7 to 0.11 with the default renderer. Is it safe to say that the bigger the value, the better the match? Also, what is the range of values of the “quality” column?

  2. I also found the isotope_score setting in AccurateMassSearch, which needs to be set to true; the result is reported in the “opt_global_isosim_score” column. It seems that the range is from -1 to 1, is that correct? Also, the description says: “Computes a similarity score for each hit (only if the feature exhibits at least two isotopic mass traces).” What does it do for features that exhibit only one isotopic mass trace? Are they reported as -1?

I am wondering which of the two scores (or both) makes more sense to use in order to reduce the number of identified metabolites.


So I had a look:

  1. The quality is an intensity-weighted sum, over every pair of mass traces in a feature, of the product of an mz_score (1 if it lies in the expected range based on the considered elements, decreasing with distance) and the RT cosine similarity (on intensities). It is therefore between 0 and 1.

  2. The isotope similarity score is a cosine similarity on the apices of the traces, and therefore between 0 and 1. A value of -1 indicates that there was only one trace.
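To illustrate the second score, here is a rough sketch of a cosine similarity between observed trace apex intensities and an expected isotope pattern, with -1 as the marker for a single trace. This is not the OpenMS code; the function name, the truncation to the shorter pattern, and the example intensities are assumptions for illustration:

```python
import math


def isotope_cosine_score(observed_apices, expected_pattern):
    """Cosine similarity of two isotope patterns, or -1.0 if fewer than two traces can be compared."""
    # Assumption: only the isotope peaks present in both patterns are compared.
    n = min(len(observed_apices), len(expected_pattern))
    if n < 2:
        return -1.0
    obs = observed_apices[:n]
    exp = expected_pattern[:n]
    dot = sum(o * e for o, e in zip(obs, exp))
    norm = math.sqrt(sum(o * o for o in obs)) * math.sqrt(sum(e * e for e in exp))
    return dot / norm if norm > 0 else 0.0


# Hypothetical apex intensities of three mass traces and a matching theoretical pattern.
matching = isotope_cosine_score([100.0, 11.0, 1.0], [1.0, 0.11, 0.01])
# A feature with a single mass trace cannot be scored.
single = isotope_cosine_score([100.0], [1.0, 0.11, 0.01])
```

Because intensities are non-negative, the cosine cannot drop below 0, which is why -1 is free to serve as the "not scored" marker.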

Both make sense to take into account. The first is more a measure of whether the feature itself makes sense; the second is a measure for picking between multiple identifications of a feature. If you need more distinction, you will need fragment spectra.
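As a sketch of how the two scores could be combined on an exported result table: drop features below a quality cutoff, then keep the best-scoring candidate per feature. The column names "quality" and "opt_global_isosim_score" are from this thread; "feature_id", "name", the example rows, and the 0.01 cutoff are hypothetical and would need tuning on your data:

```python
def reduce_candidates(rows, min_quality=0.01):
    """Keep, per feature, the candidate with the highest isotope similarity score."""
    best = {}
    for row in rows:
        # Step 1: discard candidates whose feature looks unreliable.
        if row["quality"] < min_quality:
            continue
        # Step 2: among candidates of the same feature, prefer the higher isotope score.
        # A score of -1 (single trace) loses against any scored candidate.
        fid = row["feature_id"]
        if fid not in best or row["opt_global_isosim_score"] > best[fid]["opt_global_isosim_score"]:
            best[fid] = row
    return list(best.values())


# Hypothetical rows, one per (feature, candidate) pair.
rows = [
    {"feature_id": "f1", "name": "candidate A", "quality": 0.08, "opt_global_isosim_score": 0.97},
    {"feature_id": "f1", "name": "candidate B", "quality": 0.08, "opt_global_isosim_score": 0.61},
    {"feature_id": "f2", "name": "candidate C", "quality": 5.7e-7, "opt_global_isosim_score": 0.99},
    {"feature_id": "f3", "name": "candidate D", "quality": 0.11, "opt_global_isosim_score": -1.0},
]
kept = reduce_candidates(rows)
kept_names = [r["name"] for r in kept]
```

Note that candidates with identical elemental composition will get the same isotope score, so this cannot separate isomers.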
