Molecule Functionality node Request

Hi,

Would it be possible to have a "Molecular Functionality" node that works like the Molecular Property node, except that instead of selecting from a list of properties such as "number of atoms" or "Molecular Weight", you select from a list of functional groups? The node would create one new column per selected group, named after the selected structure column with the functional group as a suffix, and each column would count the instances of that group in the molecule.

So some example groups would be: amines, carboxylic acids, amides, esters, phenols, ethers, alcohols, bromo, chloro, iodo, fluoro, boronic acid, boronate ester, stannane, pyridine, phenyl, nitro, nitrile, oxime, phosphate, amidine, urea, carbamate, ketone, aldehyde, imine, alkene, alkyne, imidazole, pyrazole, triazole, etc.

So if a molecule in the Indigo column called "Structure" has two amide groups, the new column called "Structure - Amides" reports "2".

These counts can be very useful for spotting trends, for example between the number of amides and blood-brain barrier penetration, or between an increased number of amines and improved solubility or increased phospholipidosis issues.

I hope it's possible.

Thanks

Simon.

Simon,

Counting the functional groups of a certain type boils down to counting the matches of a certain SMARTS expression. Currently, Indigo can do that with the Substructure Match Counter node. But I absolutely agree that having a special node for the functional groups would be useful. We will do that as soon as we get the SMARTS expressions for all the functional groups you listed :) If you know a place where we can find them, please let me know. But we will do them anyway.
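
As a rough sketch of that idea (not the node's actual implementation), the per-group count columns could be built as shown below. `count_matches` is a hypothetical callable standing in for whatever SMARTS engine is used, `add_group_counts` is an illustrative name, and the column naming follows the "Structure - Amides" convention from the request.

```python
from typing import Callable, Dict, List


def add_group_counts(
    rows: List[Dict[str, object]],
    structure_col: str,
    groups: Dict[str, str],
    count_matches: Callable[[object, str], int],
) -> List[Dict[str, object]]:
    """Return copies of `rows` with one extra count column per functional group.

    `groups` maps a group name to its SMARTS expression.
    `count_matches(structure, smarts)` is a hypothetical hook: it must return
    the number of matches of `smarts` in `structure`.
    """
    result = []
    for row in rows:
        new_row = dict(row)  # leave the input row untouched
        for name, smarts in groups.items():
            column = f"{structure_col} - {name}"
            new_row[column] = count_matches(row[structure_col], smarts)
        result.append(new_row)
    return result
```

Keeping the matcher as a parameter means the same loop works with any toolkit that can count SMARTS matches.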

 

Regards,

Dmitry

Hi,

Such a node would indeed be very important. Note that a tree is a more logical representation, since a carbonyl can be an aldehyde, a ketone or a carboxylic acid. Here are the functional group patterns organized as a logical XML tree. Please test them first, as they were previously used with CDK and not with Indigo.

 

  <Molecule>
    <Aliphatic>
      <Alkane Smarts="[CX4;$([H3][#6]),$([H2]([#6])[#6]),$([H1]([#6])([#6])[#6]),$([#6]([#6])([#6])([#6])[#6])]">
        <Primary Smarts="[CX4H3][#6]"></Primary>
        <Secondary Smarts="[CX4H2]([#6])[#6]"></Secondary>
        <Tertiary Smarts="[CX4H1]([#6])([#6])[#6]"></Tertiary>
        <Quaternary Smarts="[CX4]([#6])([#6])([#6])[#6]"/>
      </Alkane>
      <Alkene Smarts="[CX3;$([H2]),$([H1][#6]),$(C([#6])[#6])]=[CX3;$([H2]),$([H1][#6]),$(C([#6])[#6])]"></Alkene>
      <Alkyne Smarts="[CX2]#[CX2]"></Alkyne>
      <Allene Smarts="[CX3]=[CX2]=[CX3]"></Allene>
    </Aliphatic>
    <Aromatic>
      <Aniline Smarts="[c][NX3H2]"></Aniline>
      <Benzenering Smarts="c1ccccc1"></Benzenering>
      <Carbazole Smarts="c1c3c(ccc1)c2ccccc2n3-[$([#1]),$([#6])]"></Carbazole>
      <Iminoarene Smarts="[*;r6]=[NX2]"></Iminoarene>
      <Oxoarene Smarts="[*;r6]=[O]"></Oxoarene>
      <Phenol Smarts="[c][OX2H]"></Phenol>
      <Thioarene Smarts="[*;r6]=[S]"></Thioarene>
    </Aromatic>
    <CHN>
      <Amidine Smarts="[NX3][C]=[NX2]"></Amidine>
      <Amine Smarts="[NX3,NX3+0,NX4+;!$([N]~[#7,#8,#15,#16])]">
        <Primary Smarts="[NX3H2,NX3H2+0,NX4H3+;!$([N]~[#7,#8,#15,#16])]"></Primary>
        <Secondary Smarts="[NX3H1,NX3H1+0,NX4H2+;!$([N]~[#7,#8,#15,#16])]"></Secondary>
        <Tertiary Smarts="[NX3H0,NX3H0+0,NX4H1+;!$([N]~[#7,#8,#15,#16])]"></Tertiary>
        <Ammonium Smarts="[NH4+]"></Ammonium>
        <Ammonia Smarts="[NH3]"></Ammonia>
      </Amine>
      <Aminal Smarts="[NX3v3;!$(NC=[#7,#8,#15,#16])]~[CX4;!$(C(N)(N)[!#6])]~[NX3v3;!$(NC=[#7,#8,#15,#16])]"></Aminal>
      <Aziridine Smarts="C1CN1"></Aziridine>
      <Azide Smarts="[NX1]~[NX2]~[NX2,NX1]"></Azide>
      <Acylazide Smarts="[CX3](=[OX1])[NX2]~[NX2]~[NX1]"></Acylazide>
      <Azo Smarts="[#6]-[#7]=[#7]-[#6]"></Azo>
      <Diazo Smarts="[$([#6]=[N+]=[N-]),$([#6-]-[N+]#[N])]"></Diazo>
      <Diazonium Smarts="[#6][NX2+]#[NX1]"></Diazonium>
      <Carbodiimide Smarts="[NX2]=[CX2]=[NX2]"></Carbodiimide>
      <Enamine Smarts="[NX3;!$(NC~[!#1!#6])][CX3]=[CX3]"></Enamine>
      <Guanidine Smarts="[N;v3X3,v4X4+][CX3](=[N;v3X2,v4X3+])[N;v3X3,v4X4+]"></Guanidine>
      <Hydrazine Smarts="[NX3;$([H2]),$([H1][#6]),$([H0]([#6])[#6]);!$(NC=[O,N,S])][NX3;$([H2]),$([H1][#6]),$([H0]([#6])[#6]);!$(NC=[O,N,S])]"></Hydrazine>
      <Hydrazone Smarts="[NX3;$([H2]),$([H1][#6]),$([H0]([#6])[#6]);!$(NC=[O,N,S])][NX2]=[#6]"></Hydrazone>
      <Iminyl Smarts="[NX2;$([N][#6]),$([NH]);!$([N][CX3]=[#7,#8,#15,#16])]=[CX3;$([CH2]),$([CH][#6]),$([C]([#6])[#6])]">
        <Aldimine Smarts="[NX2;$([H1]),$([H0][#6])]=[CX3;$([H2]),$([H1])]">
          <Primary Smarts="[NX2;$([H1])]=[CX3;$([H2]),$([H1])]"></Primary>
          <Secondary Smarts="[NX2;$([H0][#6])]=[CX3;$([H2]),$([H1])]"></Secondary>
        </Aldimine>
        <ketimine Smarts="[NX2;$([H1]),$([H0][#6])]=[CX3;$([H0][#6])]">
          <Primary Smarts="[NX2;$([H1])]=[CX3;$([H0][#6])]"></Primary>
          <Secondary Smarts="[NX2;$([H0][#6])]=[CX3;$([H0][#6])]"></Secondary>
        </ketimine>
      </Iminyl>
      <Isocyanide Smarts="[N+]#[C-]"></Isocyanide>
      <Nitrile Smarts="[NX1]#[CX2]"></Nitrile>
      <Thiosemicarbazone Smarts="[#7X2](=[#6])[#7X3][#6X3]([#7X3;!$([#7][#7])])=[SX1]"></Thiosemicarbazone>
    </CHN>
    <CHNO>
      <AcylNitrile Smarts="[NX1]#[CX2][CX3]=[OX1]"></AcylNitrile>
      <Amide Smarts="[CX3;$([R0][#6]),$([H1R0])](=[OX1])[#7X3H2,#7X3H1,#7X3H0]">
        <Primary Smarts="[CX3;$([R0][#6]),$([H1R0])](=[OX1])[NX3H2]"></Primary>
        <Secondary Smarts="[CX3;$([R0][#6]),$([H1R0])](=[OX1])[#7X3H1][#6;!$(C=[O,N,S])]"></Secondary>
        <Tertiary Smarts="[CX3;$([R0][#6]),$([H1R0])](=[OX1])[#7X3H0]([#6;!$(C=[O,N,S])])[#6;!$(C=[O,N,S])]"></Tertiary>
      </Amide>
      <Azoxy Smarts="[#6][$([NX2]=[NX3+]([O-])[#6]),$([NX2]=[NX3+0](=[O])[#6])]"></Azoxy>
      <Carbamate Smarts="[NX3,NX4+][CX3](=[OX1])[OX2,OX1-]"></Carbamate>
      <Cyanate Smarts="[OX2][CX2]#[NX1]"></Cyanate>
      <Enamide Smarts="[CX3]=[CX3]~[CX3;$([R0][#6]),$([H1R0])](=[OX1])[NX3]"></Enamide>
      <Hydroxamicacid Smarts="[OX1]=[CX3][NX3;$([H1][OH])]"></Hydroxamicacid>
      <Hydroxylamine Smarts="[NX3;$([H2]),$([H1][#6]),$([H0]([#6])[#6]);!$(NC=[O,N,S])][OX2H1,OX2H0]"></Hydroxylamine>
      <Hemiaminal Smarts="[NX3,NX3+0,NX4+;!$([N]~[#7,#8,#15,#16])]~[CX4]~[OX2H]"></Hemiaminal>
      <Imidate Smarts="[NX2;$([H1]),$([H0][#6])]=[CX3;$([H0][#8])]"></Imidate>
      <Imide Smarts="[CX3](=[OX1])[NX3][CX3](=[OX1])"></Imide>
      <Isocyanate Smarts="[NX2]=[CX2]=[OX1]"></Isocyanate>
      <Lactam Smarts="[#6R][#6X3R](=[OX1])[#7X3H2,#7X3H1,#7X3H0]"></Lactam>
      <Nitrate Smarts="[NX3+]([OX1-])(=[OX1])([OX1-])"></Nitrate>
      <Nitrite Smarts="[NX2](=[OX1])[O;$([X2]),$([X1-])]"></Nitrite>
      <Nitro Smarts="[$([NX3](=O)=O),$([NX3+](=O)[O-])][!#8]"></Nitro>
      <Nitrone Smarts="[NX3+;$([H1][OX1-]),$([H0][OX1-])]=[C]"></Nitrone>
      <Nitroso Smarts="[NX2](=[OX1])[!#7;!#8]"></Nitroso>
      <Nitrosamine Smarts="[NX3;!$(N=O)][NX2]=[OX1]"></Nitrosamine>
      <Oxime Smarts="[NX2](=[CX3;$([CH2]),$([CH][#6]),$([C]([#6])[#6])])[OX2H]"></Oxime>
      <Semicarbazone Smarts="[#7X2](=[#6])[#7X3][#6X3]([#7X3;!$([#7][#7])])=[OX1]"></Semicarbazone>
      <Semicarbazide Smarts="[#7X3][#7X3][#6X3]([#7X3;!$([#7][#7])])=[OX1]"></Semicarbazide>
      <Thionitrite Smarts="[SX2][NX2]=[OX1]"></Thionitrite>
      <OximEther Smarts="[NX2](=[CX3;$([CH2]),$([CH][#6]),$([C]([#6])[#6])])[OX2][#6;!$(C=[#7,#8])]"></OximEther>
      <Urea Smarts="[NX3][CX3](=[OX1])[NX3]"></Urea>
      <Urethane Smarts="[NX3,NX4+][CX3](=[OX1])[OX2,OX1-]"></Urethane>
    </CHNO>
    <CHO>
      <Acetal Smarts="[CX4][OX2][CX4H1]([#6])[OX2][CX4]"></Acetal>
      <Anhydride Smarts="[CX3;$([H0][#6]),$([H1])](=[OX1])[#8X2][CX3;$([H0][#6]),$([H1])](=[OX1])"></Anhydride>
      <Alcohol Smarts="[OX2H][CX4;!$(C([OX2H])[O,S,#7,#15])]">
        <Primary Smarts="[OX2H][CX4H2;!$(C([OX2H])[O,S,#7,#15])]"></Primary>
        <Secondary Smarts="[OX2H][CX4H;!$(C([OX2H])[O,S,#7,#15])]"></Secondary>
        <Tertiary Smarts="[OX2H][CX4;$([H0])]"></Tertiary>
      </Alcohol>
      <AminoAlcohol-1-2 Smarts="[OX2H][CX4;!$(C([OX2H])[O,S,#7,#15,F,Cl,Br,I])][CX4;!$(C([N])[O,S,#7,#15])][NX3,NX4+;!$(NC=[O,S,N])]"></AminoAlcohol-1-2>
      <Carbonate Smarts="[CX3](=[OX1])([OX2][CX4])[OX2][CX4]"></Carbonate>
      <Diol-1-2 Smarts="[OX2H][CX4;!$(C([OX2H])[O,S,#7,#15])][CX4;!$(C([OX2H])[O,S,#7,#15])][OX2H]"></Diol-1-2>
      <Diol-1-1 Smarts="[OX2H][CX4;!$(C([OX2H])([OX2H])[O,S,#7,#15])][OX2H]"></Diol-1-1>
      <Carbonyl Smarts="[CX3]=[OX1]">
        <Aldehyde Smarts="[$([CX3H][#6]),$([CX3H2])]=[OX1]"></Aldehyde>
        <Ketone Smarts="[#6][CX3](=[OX1])[#6]">
          <Enone Smarts="[CX3]=[CX3]~[CX3;$([R0][#6]),$([H1R0])](=[OX1])[CX4]"></Enone>
        </Ketone>
        <CarboxylicAcidDerivative Smarts="[CX3;$([R0][#6]),$([H1R0])](=[OX1])[$([OX2H]),$([OX1-]),$([OX2][#6;!$(C=[O,N,S])])]">
          <CarboxylicAcid Smarts="[CX3;$([R0][#6]),$([H1R0])](=[OX1])[$([OX2H]),$([OX1-])]"></CarboxylicAcid>
          <CarboxylicEster Smarts="[CX3;$([R0][#6]),$([H1R0])](=[OX1])[OX2][#6;!$(C=[O,N,S])]"></CarboxylicEster>
        </CarboxylicAcidDerivative>
      </Carbonyl>
      <Enol Smarts="[OX2H][CX3]=[CX3]"></Enol>
      <EnolEster Smarts="[OX2][CX3](=[CX3])[CX3]=[O]"></EnolEster>
      <EnolEther Smarts="[OX2]([#6;!$(C=[N,O,S])])[CX3;$([H0][#6]),$([H1])]=[CX3]"></EnolEther>
      <Epoxide Smarts="[OX2r3]1[#6r3][#6r3]1"></Epoxide>
      <Ether Smarts="[OD2;!$(OC~[!#1!#6])]([#6])[#6]"></Ether>
      <Ketene Smarts="[CX3]=[CX2]=[OX1]"></Ketene>
      <Hemiacetal Smarts="[OX2H][CX4H1,!$(C(O)(O)[!#6])][OX2][#6;!$(C=[O,S,N])]"></Hemiacetal>
      <Hydrate Smarts="[OX2;$([H2])]"></Hydrate>
      <Lactone Smarts="[#6][#6X3R](=[OX1])[#8X2][#6;!$(C=[O,N,S])]"></Lactone>
      <Peroxide Smarts="[OX2,OX1-][OX2,OX1-]">
        <Hydroperoxide Smarts="[OX2H][OX2]"></Hydroperoxide>
      </Peroxide>
    </CHO>
    <Halogens>
      <Acidhalide Smarts="[CX3;$([R0][#6]),$([H1R0])](=[OX1])[FX1,ClX1,BrX1,IX1]">
        <AcidBromide Smarts="[CX3;$([R0][#6]),$([H1R0])](=[OX1])[BrX1]"></AcidBromide>
        <AcidChloride Smarts="[CX3;$([R0][#6]),$([H1R0])](=[OX1])[ClX1]"></AcidChloride>
        <AcidFluoride Smarts="[CX3;$([R0][#6]),$([H1R0])](=[OX1])[FX1]"></AcidFluoride>
        <AcidIodide Smarts="[CX3;$([R0][#6]),$([H1R0])](=[OX1])[IX1]"></AcidIodide>
      </Acidhalide>
      <AlkenylHalide Smarts="[C]=[C][FX1,ClX1,BrX1,IX1]">
        <AlkenylBromide Smarts="[C]=[C][BrX1]"></AlkenylBromide>
        <AlkenylChloride Smarts="[C]=[C][ClX1]"></AlkenylChloride>
        <AlkenylFluoride Smarts="[C]=[C][FX1]"></AlkenylFluoride>
        <AlkenylIodide Smarts="[C]=[C][IX1]"></AlkenylIodide>
      </AlkenylHalide>
      <AlkyneHalide Smarts="[C]#[C][FX1,ClX1,BrX1,IX1]">
        <AlkyneBromide Smarts="[C]#[C][BrX1]"></AlkyneBromide>
        <AlkyneChloride Smarts="[C]#[C][ClX1]"></AlkyneChloride>
        <AlkyneFluoride Smarts="[C]#[C][FX1]"></AlkyneFluoride>
        <AlkyneIodide Smarts="[C]#[C][IX1]"></AlkyneIodide>
      </AlkyneHalide>
      <AlkylHalide Smarts="[CX4][FX1,ClX1,BrX1,IX1]">
        <AlkylBromide Smarts="[BrX1][CX4]"></AlkylBromide>
        <AlkylChloride Smarts="[ClX1][CX4]"></AlkylChloride>
        <AlkylFluoride Smarts="[FX1][CX4]"></AlkylFluoride>
        <AlkylIodide Smarts="[IX1][CX4]"></AlkylIodide>
      </AlkylHalide>
      <ArylHalide Smarts="c1ccccc1[FX1,ClX1,BrX1,IX1]">
        <ArylBromide Smarts="c1ccccc1[BrX1]"></ArylBromide>
        <ArylChloride Smarts="c1ccccc1[ClX1]"></ArylChloride>
        <ArylFluoride Smarts="c1ccccc1[FX1]"></ArylFluoride>
        <ArylIodide Smarts="c1ccccc1[IX1]"></ArylIodide>
      </ArylHalide>
    </Halogens>
    <Organometallic>
      <MetallicCompounds Smarts="[Li,Na,K,Rb,Cs,Be,Ca,Sr,Mg,Ba,Sc,Ti,V,Cr,Mn,Fe,Co,Ni,Cu,Zn,Al,Ga,Y,Zr,Nb,Mo,Tc,Ru,Rh,Pd,Ag,Cd,In,Sn,Lu,Hf,Ta,W,Re,Os,Ir,Pt,Au,Hg,Tl,Pb,Bi]">
        <LithiumIon Smarts="[Li+]"></LithiumIon>
        <Lithium Smarts="[LiX1]"></Lithium>
        <SodiumIon Smarts="[Na+]"></SodiumIon>
        <Sodium Smarts="[NaX1]"></Sodium>
        <PotassiumIon Smarts="[K+]"></PotassiumIon>
        <Potassium Smarts="[KX1]"></Potassium>
        <RubidiumIon Smarts="[Rb+]"></RubidiumIon>
        <Rubidium Smarts="[RbX1]"></Rubidium>
        <CaesiumIon Smarts="[Cs+]"></CaesiumIon>
        <Caesium Smarts="[CsX1]"></Caesium>
        <BerylliumIon Smarts="[Be++]"></BerylliumIon>
        <Beryllium Smarts="[BeX2]"></Beryllium>
        <CalciumIon Smarts="[Ca++]"></CalciumIon>
        <Calcium Smarts="[CaX2]"></Calcium>
        <StrontiumIon Smarts="[Sr++]"></StrontiumIon>
        <Strontium Smarts="[SrX2]"></Strontium>
        <MagnesiumIon Smarts="[Mg++]"></MagnesiumIon>
        <Magnesium Smarts="[MgX2]"></Magnesium>
        <BariumIon Smarts="[Ba++]"></BariumIon>
        <Barium Smarts="[BaX2]"></Barium>
        <ScandiumIon Smarts="[Sc++]"></ScandiumIon>
        <Scandium Smarts="[ScX3]"></Scandium>
        <TitaniumIon Smarts="[Ti++++,Ti+++,Ti++]">
          <Titanium4Ion Smarts="[Ti++++]"></Titanium4Ion>
          <Titanium3Ion Smarts="[Ti+++]"></Titanium3Ion>
          <Titanium2Ion Smarts="[Ti++]"></Titanium2Ion>
        </TitaniumIon>
        <Titanium Smarts="[TiX4,TiX3]"></Titanium>
        <VanadiumIon Smarts="[V+++++,V++++,V+++,V++]">
          <Vanadium5Ion Smarts="[V+++++]"></Vanadium5Ion>
          <Vanadium4Ion Smarts="[V++++]"></Vanadium4Ion>
          <Vanadium3Ion Smarts="[V+++]"></Vanadium3Ion>
          <Vanadium2Ion Smarts="[V++]"></Vanadium2Ion>
        </VanadiumIon>
        <Vanadium Smarts="[VX5,VX4,VX3,VX2]"></Vanadium>
        <ChromiumIon Smarts="[Cr+++,Cr++]">
          <Chromium3Ion Smarts="[Cr+++]"></Chromium3Ion>
          <Chromium2Ion Smarts="[Cr++]"></Chromium2Ion>
        </ChromiumIon>
        <Chromium Smarts="[CrX6,CrX3,CrX2]"></Chromium>
        <ManganeseIon Smarts="[Mn+++++++,Mn++++++,Mn++++,Mn+++,Mn++]">
          <Manganese7Ion Smarts="[Mn+++++++]"></Manganese7Ion>
          <Manganese6Ion Smarts="[Mn++++++]"></Manganese6Ion>
          <Manganese4Ion Smarts="[Mn++++]"></Manganese4Ion>
          <Manganese3Ion Smarts="[Mn+++]"></Manganese3Ion>
          <Manganese2Ion Smarts="[Mn++]"></Manganese2Ion>
        </ManganeseIon>
        <Manganese Smarts="[MnX7,MnX6,MnX5,MnX4,MnX3,MnX2]"></Manganese>
        <IronIon Smarts="[Fe+++,Fe++]">
          <Iron3Ion Smarts="[Fe+++]"></Iron3Ion>
          <Iron2Ion Smarts="[Fe++]"></Iron2Ion>
        </IronIon>
        <Iron Smarts="[FeX3,FeX2]"></Iron>
        <CobaltIon Smarts="[Co+++,Co++]">
          <Cobalt3Ion Smarts="[Co+++]"></Cobalt3Ion>
          <Cobalt2Ion Smarts="[Co++]"></Cobalt2Ion>
        </CobaltIon>
        <Cobalt Smarts="[CoX3,CoX2]"></Cobalt>
        <NickelIon Smarts="[Ni+++,Ni++]">
          <Nickel3Ion Smarts="[Ni+++]"></Nickel3Ion>
          <Nickel2Ion Smarts="[Ni++]"></Nickel2Ion>
        </NickelIon>
        <Nickel Smarts="[NiX3,NiX2]"></Nickel>
        <CopperIon Smarts="[Cu++,Cu+]">
          <Copper2Ion Smarts="[Cu++]"></Copper2Ion>
          <Copper1Ion Smarts="[Cu+]"></Copper1Ion>
        </CopperIon>
        <Copper Smarts="[CuX2]"></Copper>
        <zincIon Smarts="[Zn++]"></zincIon>
        <zinc Smarts="[ZnX2]"></zinc>
        <AluminiumIon Smarts="[Al+++]"></AluminiumIon>
        <Aluminium Smarts="[AlX3]"></Aluminium>
        <GalliumIon Smarts="[Ga+++]"></GalliumIon>
        <Gallium Smarts="[GaX3,GaX2]"></Gallium>
        <YttriumIon Smarts="[Y+++]"></YttriumIon>
        <Yttrium Smarts="[YX3]"></Yttrium>
        <ZirconiumIon Smarts="[Zr++++]"></ZirconiumIon>
        <Zirconium Smarts="[ZrX4,ZrX3,ZrX2]"></Zirconium>
        <NiobiumIon Smarts="[Nb+++++,Nb++++,Nb+++,Nb++]">
          <Niobium5Ion Smarts="[Nb+++++]"></Niobium5Ion>
          <Niobium4Ion Smarts="[Nb++++]"></Niobium4Ion>
          <Niobium3Ion Smarts="[Nb+++]"></Niobium3Ion>
          <Niobium2Ion Smarts="[Nb++]"></Niobium2Ion>
        </NiobiumIon>
        <Niobium Smarts="[NbX5,NbX4,NbX3,NbX2]"></Niobium>
        <MolybdenumIon Smarts="[Mo++++++,Mo+++++,Mo++++,Mo+++,Mo++]">
          <Molybdenum6Ion Smarts="[Mo++++++]"></Molybdenum6Ion>
          <Molybdenum5Ion Smarts="[Mo+++++]"></Molybdenum5Ion>
          <Molybdenum4Ion Smarts="[Mo++++]"></Molybdenum4Ion>
          <Molybdenum3Ion Smarts="[Mo+++]"></Molybdenum3Ion>
          <Molybdenum2Ion Smarts="[Mo++]"></Molybdenum2Ion>
        </MolybdenumIon>
        <Molybdenum Smarts="[MoX6,MoX5,MoX4,MoX3,MoX2]"></Molybdenum>
        <Technetium Smarts="[TcX7,TcX6,TcX5,TcX4,TcX2]"></Technetium>
        <RutheniumIon Smarts="[Ru++++,Ru+++,Ru++]">
          <Ruthenium4Ion Smarts="[Ru++++]"></Ruthenium4Ion>
          <Ruthenium3Ion Smarts="[Ru+++]"></Ruthenium3Ion>
          <Ruthenium2Ion Smarts="[Ru++]"></Ruthenium2Ion>
        </RutheniumIon>
        <Ruthenium Smarts="[RuX4,RuX3,RuX2]"></Ruthenium>
        <RhodiumIon Smarts="[Rh+++,Rh++]">
          <Rhodium3Ion Smarts="[Rh+++]"></Rhodium3Ion>
          <Rhodium2Ion Smarts="[Rh++]"></Rhodium2Ion>
        </RhodiumIon>
        <Rhodium Smarts="[RhX6,RhX5,RhX4,RhX3,RhX2]"></Rhodium>
        <PalladiumIon Smarts="[Pd++++,Pd++]">
          <Palladium4Ion Smarts="[Pd++++]"></Palladium4Ion>
          <Palladium2Ion Smarts="[Pd++]"></Palladium2Ion>
        </PalladiumIon>
        <Palladium Smarts="[PdX4,PdX2]"></Palladium>
        <SilverIon Smarts="[Ag+++,Ag++,Ag+]">
          <Silver3Ion Smarts="[Ag+++]"></Silver3Ion>
          <Silver2Ion Smarts="[Ag++]"></Silver2Ion>
          <Silver1Ion Smarts="[Ag+]"></Silver1Ion>
        </SilverIon>
        <Silver Smarts="[AgX3,AgX2]"></Silver>
        <CadmiumIon Smarts="[Cd++]"></CadmiumIon>
        <Cadmium Smarts="[CdX2]"></Cadmium>
        <Indium Smarts="[InX3]"></Indium>
        <TinIon Smarts="[Sn++++,Sn++]">
          <Tin4Ion Smarts="[Sn++++]"></Tin4Ion>
          <Tin2Ion Smarts="[Sn++]"></Tin2Ion>
        </TinIon>
        <Tin Smarts="[SnX4,SnX2]"></Tin>
        <Lutetium Smarts="[LuX3]"></Lutetium>
        <Hafnium Smarts="[HfX4]"></Hafnium>
        <Tantalum Smarts="[TaX5,TaX4,TaX3]"></Tantalum>
        <Tungsten Smarts="[WX6,WX5,WX4,WX3,WX2]"></Tungsten>
        <Rhenium Smarts="[ReX7,ReX6,ReX4,ReX2]"></Rhenium>
        <Osmium Smarts="[OsX4,OsX3]"></Osmium>
        <Iridium Smarts="[IrX6,IrX4,IrX3,IrX2]"></Iridium>
        <Platinum Smarts="[PtX4,PtX3,PtX2]"></Platinum>
        <Gold Smarts="[AuX3,AuX1]"></Gold>
        <Mercury Smarts="[HgX2,HgX1]"></Mercury>
        <Thallium Smarts="[TlX3,TlX1]"></Thallium>
        <Lead Smarts="[PbX4,PbX2]"></Lead>
        <Bismuth Smarts="[BiX5,BiX3]"></Bismuth>
      </MetallicCompounds>
      <Metalloid Smarts="[B,Si,Ge,As,Sb,Te,At]">
        <Boron Smarts="[BX3,BX4-]"></Boron>
        <Silicon Smarts="[SiX4,SiX3,SiX2,SiX1,SiX4-,SiX3-,SiX2-,SiX1-]"></Silicon>
        <Germanium Smarts="[GeX4,GeX3,GeX2,GeX1,GeX4-,GeX3-,GeX2-,GeX1-]"></Germanium>
        <Arsenic Smarts="[AsX5,AsX3,AsX2,AsX1,AsX3-]"></Arsenic>
        <Antimony Smarts="[SbX5,SbX4+,SbX3,SbX3-]"></Antimony>
        <Tellurium Smarts="[TeX6,TeX5,TeX4,TeX2,TeX2-]"></Tellurium>
      </Metalloid>
    </Organometallic>
    <Phosphorus>
      <PhosphoricAcidDerivative Smarts="[PX4D4](=[!#6])([!#6])([!#6])[!#6]">
        <PhosphoricAcid Smarts="[PX4D4](=[OX1])([$([OX2H]),$([OX1-])])([$([OX2H]),$([OX1-])])[$([OX2H]),$([OX1-])]"></PhosphoricAcid>
        <PhosphoricMonoester Smarts="[PX4D4](=[OX1])([$([OX2H]),$([OX1-])])([$([OX2H]),$([OX1-])])[OX2][#6;!$(C=[O,N,S])]"></PhosphoricMonoester>
        <PhosphoricDiester Smarts="[PX4D4](=[OX1])([$([OX2H]),$([OX1-])])([OX2][#6;!$(C=[O,N,S])])[OX2][#6;!$(C=[O,N,S])]"></PhosphoricDiester>
        <PhosphoricTriester Smarts="[PX4D4](=[OX1])([OX2][#6;!$(C=[O,N,S])])([OX2][#6;!$(C=[O,N,S])])[OX2][#6;!$(C=[O,N,S])]"></PhosphoricTriester>
        <PhosphoricMonoamide Smarts="[PX4D4](=[#8])([#8])([#8])[#7X3H2,#7X3H1,#7X3H0]"></PhosphoricMonoamide>
        <PhosphoricDiamide Smarts="[PX4D4](=[#8])([#8])([#7X3H2,#7X3H1,#7X3H0])[#7X3H2,#7X3H1,#7X3H0]"></PhosphoricDiamide>
        <PhosphoricTriamide Smarts="[PX4D4](=[#8])([#7X3H2,#7X3H1,#7X3H0])([#7X3H2,#7X3H1,#7X3H0])[#7X3H2,#7X3H1,#7X3H0]"></PhosphoricTriamide>
      </PhosphoricAcidDerivative>
      <Phosphane Smarts="[PX3;$([H3]),$([H2][#6]),$([H1]([#6])[#6]),$([H0]([#6])([#6])[#6])]"></Phosphane>
      <PhosphineOxide Smarts="[PX4;$([H3]=[OX1]),$([H2](=[OX1])[#6]),$([H1](=[OX1])([#6])[#6]),$([H0](=[OX1])([#6])([#6])[#6])]"></PhosphineOxide>
      <PhosphineSulfide Smarts="[PX4;$([H3]=[SX1]),$([H2](=[SX1])[#6]),$([H1](=[SX1])([#6])[#6]),$([H0](=[SX1])([#6])([#6])[#6])]"></PhosphineSulfide>
      <PhosphonicAcidDerivative Smarts="[PX4;$([H1]),$([H0][#6])](=[#8])([!#6])[!#6]">
        <PhosphonicAcid Smarts="[PX4;$([H1]),$([H0][#6])](=[OX1])([$([OX2H]),$([OX1-])])[$([OX2H]),$([OX1-])]"></PhosphonicAcid>
        <PhosphonicMonoester Smarts="[PX4;$([H1]),$([H0][#6])](=[OX1])([$([OX2H]),$([OX1-])])[OX2][#6;!$(C=[O,N,S])]"></PhosphonicMonoester>
        <PhosphonicDiester Smarts="[PX4;$([H1]),$([H0][#6])](=[OX1])([OX2][#6;!$(C=[O,N,S])])[OX2][#6;!$(C=[O,N,S])]"></PhosphonicDiester>
        <PhosphonicMonoamide Smarts="[PX4;$([H1]),$([H0][#6])](=[OX1])([$([OX2H]),$([OX1-])])[#7X3H2,#7X3H1,#7X3H0]"></PhosphonicMonoamide>
        <PhosphonicDiamide Smarts="[PX4;$([H1]),$([H0][#6])](=[OX1])([#7X3H2,#7X3H1,#7X3H0])[#7X3H2,#7X3H1,#7X3H0]"></PhosphonicDiamide>
        <PhosphonicEsteramide Smarts="[PX4;$([H1]),$([H0][#6])](=[OX1])([OX2][#6;!$(C=[O,N,S])])[#7X3H2,#7X3H1,#7X3H0]"></PhosphonicEsteramide>
      </PhosphonicAcidDerivative>
      <Phosphonium Smarts="[PX4+;!$([P]*~[#7,#8,#15,#16])]"></Phosphonium>
      <PhosphinicAcidDerivative Smarts="[PX4;$([H2]),$([H1][#6]),$([H0]([#6])[#6])](=[#8])[!#6]">
        <PhosphinicAcid Smarts="[PX4;$([H2]),$([H1][#6]),$([H0]([#6])[#6])](=[OX1])[$([OX2H]),$([OX1-])]"></PhosphinicAcid>
        <PhosphinicEster Smarts="[PX4;$([H2]),$([H1][#6]),$([H0]([#6])[#6])](=[OX1])[OX2][#6;!$(C=[O,N,S])]"></PhosphinicEster>
        <PhosphinicAmide Smarts="[PX4;$([H2]),$([H1][#6]),$([H0]([#6])[#6])](=[OX1])[#7X3H2,#7X3H1,#7X3H0]"></PhosphinicAmide>
      </PhosphinicAcidDerivative>
      <PhosphonousDerivatives Smarts="[PX3;$([H1]),$([H0][#6])]">
        <PhosphonousAcid Smarts="[PX3;$([H1]),$([H0][#6])]([$([OX2H]),$([OX1-])])[$([OX2H]),$([OX1-])]"></PhosphonousAcid>
        <PhosphonousMonoester Smarts="[PX3;$([H1]),$([H0][#6])]([$([OX2H]),$([OX1-])])[OX2][#6;!$(C=[O,N,S])]"></PhosphonousMonoester>
        <PhosphonousDiester Smarts="[PX3;$([H1]),$([H0][#6])]([OX2][#6;!$(C=[O,N,S])])[OX2][#6;!$(C=[O,N,S])]"></PhosphonousDiester>
        <PhosphonousMonoamide Smarts="[PX3;$([H1]),$([H0][#6])]([$([OX2H]),$([OX1-])])[#7X3H2,#7X3H1,#7X3H0]"></PhosphonousMonoamide>
        <PhosphonousDiamide Smarts="[PX3;$([H1]),$([H0][#6])]([#7X3H2,#7X3H1,#7X3H0])[#7X3H2,#7X3H1,#7X3H0]"></PhosphonousDiamide>
        <PhosphonousEsteramide Smarts="[PX3;$([H1]),$([H0][#6])]([OX2][#6;!$(C=[O,N,S])])[#7X3H2,#7X3H1,#7X3H0]"></PhosphonousEsteramide>
      </PhosphonousDerivatives>
      <PhosphinousDerivatives Smarts="[PX3;$([H2]),$([H1][#6]),$([H0]([#6])[#6])]">
        <PhosphinousAcid Smarts="[PX3;$([H2]),$([H1][#6]),$([H0]([#6])[#6])][$([OX2H]),$([OX1-])]"></PhosphinousAcid>
        <PhosphinousEster Smarts="[PX3;$([H2]),$([H1][#6]),$([H0]([#6])[#6])][OX2][#6;!$(C=[O,N,S])]"></PhosphinousEster>
        <PhosphinousAmide Smarts="[PX3;$([H2]),$([H1][#6]),$([H0]([#6])[#6])][#7X3H2,#7X3H1,#7X3H0]"></PhosphinousAmide>
      </PhosphinousDerivatives>
    </Phosphorus>
    <Sulphur>
      <CarbothioicAcid Smarts="[CX3;$([R0][#6]),$([H1R0])](=[OX1])[$([SX2H]),$([SX1-])]"></CarbothioicAcid>
      <CarbodithioicAcid Smarts="[CX3;!R;$([C][#6]),$([CH]);$([C](=[SX1])[SX2H])]"></CarbodithioicAcid>
      <CarbodithioicEster Smarts="[CX3;!R;$([C][#6]),$([CH]);$([C](=[SX1])[SX2][#6;!$(C=[O,N,S])])]"></CarbodithioicEster>
      <Disulfide Smarts="[SX2D2][SX2D2]"></Disulfide>
      <Isothiocyanate Smarts="[NX2]=[CX2]=[SX1]"></Isothiocyanate>
      <SulfenicDerivative Smarts="[SX2;$([H1]),$([H0][#6])]">
        <SulfenicAcid Smarts="[SX2;$([H1]),$([H0][#6])][$([OX2H]),$([OX1-])]"></SulfenicAcid>
        <SulfenicHalide Smarts="[SX2;$([H1]),$([H0][#6])][FX1,ClX1,BrX1,IX1]"></SulfenicHalide>
        <SulfenicEster Smarts="[SX2;$([H1]),$([H0][#6])][OX2][#6;!$(C=[N,S])]"></SulfenicEster>
        <SulfenicAmide Smarts="[SX2;$([H1]),$([H0][#6])][#7X3H2,#7X3H1,#7X3H0]"></SulfenicAmide>
      </SulfenicDerivative>
      <SulfinicDerivative Smarts="[SX3;$([H1]),$([H0][#6])](=[#8])[!#6]">
        <SulfinicAcid Smarts="[SX3;$([H1]),$([H0][#6])](=[OX1])[$([OX2H]),$([OX1-])]"></SulfinicAcid>
        <SulfinicHalide Smarts="[SX3;$([H1]),$([H0][#6])](=[OX1])[FX1,ClX1,BrX1,IX1]"></SulfinicHalide>
        <SulfinicEster Smarts="[SX3;$([H1]),$([H0][#6])](=[OX1])[OX2][#6;!$(C=[O,N,S])]"></SulfinicEster>
        <SulfinicAmide Smarts="[SX3;$([H1]),$([H0][#6])](=[OX1])[#7X3H2,#7X3H1,#7X3H0]"></SulfinicAmide>
      </SulfinicDerivative>
      <Sulfite Smarts="[$([OX2]),$([OX1-])][SX3](=[OX1])[$([OX2]),$([OX1-])]"></Sulfite>
      <Sulfone Smarts="[$([SX4](=[OX1])(=[OX1])([#6])[#6]),$([SX4+2]([OX1-])([OX1-])([#6])[#6])]"></Sulfone>
      <SulfonicDerivative Smarts="[SX4;$([H1]),$([H0][#6])](=[#8])(=[#8])[!#6]">
        <Sulfonicacid Smarts="[SX4;$([H1]),$([H0][#6])](=[OX1])(=[OX1])[$([OX2H]),$([OX1-])]"></Sulfonicacid>
        <SulfonicHalide Smarts="[SX4;$([H1]),$([H0][#6])](=[OX1])(=[OX1])[FX1,ClX1,BrX1,IX1]"></SulfonicHalide>
        <SulfonicEster Smarts="[SX4;$([H1]),$([H0][#6])](=[OX1])(=[OX1])[OX2][#6;!$(C=[O,N,S])]"></SulfonicEster>
        <SulfonicAmide Smarts="[SX4;$([H1]),$([H0][#6])](=[OX1])(=[OX1])[#7X3H2,#7X3H1,#7X3H0]"></SulfonicAmide>
      </SulfonicDerivative>
      <Sulfonium Smarts="[S+;!$([S]~[!#6]);!$([S]*~[#7,#8,#15,#16])]"></Sulfonium>
      <Sulfoxide Smarts="[$([SX3](=[OX1])([#6])[#6]),$([SX3+]([OX1-])([#6])[#6])]"></Sulfoxide>
      <SulfuricDerivative Smarts="[SX4D4](=[#8])(=[#8])([!#6])[!#6]">
        <SulfuricAcid Smarts="[SX4](=[OX1])(=[OX1])([$([OX2H]),$([OX1-])])[$([OX2H]),$([OX1-])]"></SulfuricAcid>
        <SulfuricMonoester Smarts="[SX4](=[OX1])(=[OX1])([$([OX2H]),$([OX1-])])[OX2][#6;!$(C=[O,N,S])]"></SulfuricMonoester>
        <SulfuricDiester Smarts="[SX4](=[OX1])(=[OX1])([OX2][#6;!$(C=[O,N,S])])[OX2][#6;!$(C=[O,N,S])]"></SulfuricDiester>
        <SulfuricMonoamide Smarts="[SX4](=[OX1])(=[OX1])([#7X3H2,#7X3H1,#7X3H0])[$([OX2H]),$([OX1-])]"></SulfuricMonoamide>
        <SulfuricDiamide Smarts="[SX4](=[OX1])(=[OX1])([#7X3H2,#7X3H1,#7X3H0])[#7X3H2,#7X3H1,#7X3H0]"></SulfuricDiamide>
        <SulfuricEsteramide Smarts="[SX4](=[OX1])(=[OX1])([OX2][#6;!$(C=[O,N,S])])([#7X3H2,#7X3H1,#7X3H0])"></SulfuricEsteramide>
      </SulfuricDerivative>
      <Thioamide Smarts="[$([CX3;!R][#6]),$([CX3H;!R])](=[SX1])[#7X3H2,#7X3H1,#7X3H0]"></Thioamide>
      <Thiocarbonyl Smarts="[CX3]=[SX1]"></Thiocarbonyl>
      <Xanthate Smarts="[CX4][OX2][CX3](=[S])[SX2][CX4]"></Xanthate>
      <Thiocyanate Smarts="[SX2][CX2]#[NX1]"></Thiocyanate>
      <Thioester Smarts="[#6][SX2][CX3](=[O])[#6]"></Thioester>
      <Thiol Smarts="[SX2H]">
        <Alkylthiol Smarts="[SX2H][CX4;!$(C([SX2H])~[O,S,#7,#15])]"></Alkylthiol>
        <Arylthiol Smarts="[SX2H][c]"></Arylthiol>
      </Thiol>
      <Thiourea Smarts="[NX3][CX3](=[SX1])[NX3]"></Thiourea>
    </Sulphur>
  </Molecule>
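
A tree like the one above can be flattened into a lookup of path to SMARTS before testing the patterns one by one. The snippet below is only a sketch using Python's standard `xml.etree.ElementTree`; `SAMPLE` is a small excerpt of the tree and `flatten` is an illustrative helper name, not part of Indigo or CDK.

```python
import xml.etree.ElementTree as ET

# Small excerpt of the functional group tree, used here only for illustration.
SAMPLE = """
<Molecule>
  <CHO>
    <Carbonyl Smarts="[CX3]=[OX1]">
      <Aldehyde Smarts="[$([CX3H][#6]),$([CX3H2])]=[OX1]"></Aldehyde>
      <Ketone Smarts="[#6][CX3](=[OX1])[#6]"></Ketone>
    </Carbonyl>
  </CHO>
</Molecule>
"""


def flatten(element, path=()):
    """Yield (path, smarts) for every element that carries a Smarts attribute."""
    here = path + (element.tag,)
    smarts = element.get("Smarts")
    if smarts is not None:
        yield "/".join(here), smarts
    for child in element:
        yield from flatten(child, here)


patterns = dict(flatten(ET.fromstring(SAMPLE.strip())))
for group_path, smarts in patterns.items():
    print(group_path, smarts)
```

Using the full path as the key keeps repeated names such as Primary, Secondary and Tertiary apart, since they occur under several parents.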
 


InsilicoConsulting: thanks very much, this collection of SMARTS is indeed very useful.

Welcome. It would also be nice to have the positions at which the SMARTS match!

cheers

The Substructure Matcher node has the "highlight" option, is that what you mean?

 

Regards,

Dmitry

Well, yes, but not just highlighting or visualization of the SMARTS/substructures; rather, getting the atom numbers at which the heavy atoms of a given SMARTS match. So if the SMARTS for benzene/aromatic matches twice, we would have two lists of 6 C atoms each, with their positions/atom numbers, e.g. aromatic ring 1: 2,3,4,5,6,7 and aromatic ring 2: 13,14,15,16,17,18. For each functional group there would be corresponding position information.

I am assuming that the atoms have been numbered canonically, otherwise we would get different position numbers every time.

This information can later be used to calculate the shortest path between two rings, for example. In general, one could find the distance between any pair of SMARTS/functional groups occurring in a molecule.
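
To make the distance idea concrete, here is a minimal sketch that assumes the matched atom numbers and the bond list of the molecule are already available as plain Python data. `shortest_bond_distance` is an illustrative name; it runs a breadth-first search over the bond graph and is not tied to any particular toolkit.

```python
from collections import deque


def shortest_bond_distance(bonds, group_a, group_b):
    """Fewest bonds between any atom of group_a and any atom of group_b.

    `bonds` is an iterable of (atom, atom) index pairs.
    Returns None if the two groups are not connected.
    """
    adjacency = {}
    for a, b in bonds:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    targets = set(group_b)
    seen = set(group_a)
    # Multi-source breadth-first search: start from every atom of group_a.
    queue = deque((atom, 0) for atom in group_a)
    while queue:
        atom, distance = queue.popleft()
        if atom in targets:
            return distance
        for neighbour in adjacency.get(atom, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None
```

For a simple chain 0-1-2-3-4-5 with one group on atoms 0 and 1 and another on atoms 4 and 5, the closest pair is atoms 1 and 4, three bonds apart.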

Hi,

I can see how this could be useful, but I wouldn't want to lose the numerical column that just counts the number of occurrences of the SMARTS match. Counting functional groups this way will be very useful for running lots of different functional group counts through the models in the Mining section.

I assume the feature you discuss would need to be in an extra column as a String column.

Simon.

Yes, it would be an extra column. But one can always get the count using the GroupBy or a similar node, just in case the count is absent!

Any progress Dmitry? The indigo nodes are progressing very well in general.

cheers

InsilicoConsulting: The "highlight matches" option has been added to the Substructure Match Counter node today, and you can play with it tomorrow morning when the next nightly build is ready. You can save the highlighted results to (canonical) SMILES to get the numbers of the matched atoms (they come after the "ha:" identifier in SMILES).
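
For anyone post-processing such SMILES outside the workflow, the highlighted atom numbers could be pulled out roughly as follows. This is only a sketch: the exact layout of the block after the SMILES (here assumed to look like `|ha:2,3,4,hb:2,3|`) should be checked against real output, and `highlighted_atoms` is an illustrative helper name.

```python
import re


def highlighted_atoms(smiles_with_highlighting):
    """Return the atom indices listed after "ha:" as a list of ints.

    Assumes the indices are a comma-separated run of integers directly
    following "ha:"; returns an empty list if no such run is present.
    """
    match = re.search(r"\bha:(\d+(?:,\d+)*)", smiles_with_highlighting)
    if match is None:
        return []
    return [int(token) for token in match.group(1).split(",")]
```

Note that this gives all highlighted atoms together; it does not split them into one list per match.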

As for the "Molecule Functionality" node, it is still in progress...

Best regards,

Dmitry

 

InsilicoConsulting: yes, I think that we will implement such functionality in the following manner: the Substructure Match Counter node will highlight all matches in every input structure, and the "highlighted" column can then be saved as canonical SMILES, in which the indices of the highlighted atoms and bonds will be present. Currently, Indigo removes the highlighting from the canonical SMILES, but the next version will keep it.

 

Best regards,

Dmitry

 

I'm glad to hear the "Molecular Functionality" node is in progress. I'm looking forward to it :-)

Simon.

For some SMARTS, the count of the "parent" does not add up to the counts of the children. These SMARTS therefore need to be checked very thoroughly. Also check the list of SMARTS at Daylight.

If such a count mismatch occurs, the count of the parent could be taken as the sum of the counts of its child functional group nodes, e.g. carboxylic acid count + aldehyde count + ketone count = carbonyl count.
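
A check along these lines could be scripted once the counts for one molecule are available. The sketch below assumes the counts are held in a plain dictionary and the parent/child relationships in another; `find_count_mismatches` and both data layouts are illustrative, not part of any existing node.

```python
def find_count_mismatches(tree, counts):
    """Report parents whose own count differs from the sum of their children.

    `tree` maps a parent group name to a list of child group names.
    `counts` maps a group name to its match count (missing names count as 0).
    Returns {parent: (parent_count, sum_of_child_counts)} for each mismatch.
    """
    mismatches = {}
    for parent, children in tree.items():
        parent_count = counts.get(parent, 0)
        child_sum = sum(counts.get(child, 0) for child in children)
        if parent_count != child_sum:
            mismatches[parent] = (parent_count, child_sum)
    return mismatches
```

A reported mismatch does not by itself say which pattern is wrong: the children may overlap, or the parent may match groups that are not among its listed children.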

Hi Insilico,

Should the example be:

carboxylic acid count + aldehyde count + ketone count + amide count + urea count + carbamate count + carbonate count + carboxylic ester count = Carbonyl count.

 

Simon.

All those that have C=O! In the SMARTS I submitted, the alkane counts do not match: primary + secondary + tertiary alkanes does not equal total alkanes [as given by the first pattern]. Only the SMARTS pattern [CX4] is needed.