Extract bit vector from RDKit Fingerprint node

Hi everyone,

I’m computing a Morgan fingerprint of 16000 bits with the RDKit Fingerprint node and then extracting the bit vector with the RDKit Fingerprint Writer node.
But when I check the length of the bit vector output, it is only 4000 bits long. I do not see any option in that last node to specify the length, so is there another way to write the bit vector correctly? Or did I miss something in the parameters?

Also, I saw that it appears to be a count-based bit vector (there are numbers > 1 in it), which I did not specify. I just want 16000 bits of 0 or 1, even if a bit occurs more than once in the molecule, but this is not an option in the node. Is that normal?
And I assume that a letter in the bit vector such as “A” means that the count for that bit is higher than 9 (“A” would then stand for “10”). Am I right?

Thanks in advance for any help

Baptiste

You might want to check the output of the fingerprint node. I did a quick test writing out Morgan fingerprints with radius 2 and 4, and the number of bits seems to top out at 8196. Basically, I used string manipulation to convert the fingerprint to a string and counted the 1s and 0s. I didn’t have any letters in mine. It is also possible that the node is folding the fingerprint because it’s sparse, but I don’t know for sure.
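For reference, the counting check can be sketched in a few lines of Python. This is only an illustration, not what the node does internally: `summarize_bitstring` is a made-up helper name, and the input strings are invented examples rather than real fingerprints.

```python
from collections import Counter


def summarize_bitstring(bit_string):
    """Return (length, number of '1' characters, any characters other than 0/1)."""
    counts = Counter(bit_string)
    # Anything that is not '0' or '1' hints that the string is not a plain bit string.
    other = sorted(ch for ch in counts if ch not in "01")
    return len(bit_string), counts["1"], other


# Invented examples: a plain bit string, and one containing a letter.
length, on_bits, other = summarize_bitstring("0100100001")
print(length, on_bits, other)
print(summarize_bitstring("01A0"))
```

If the third value is not empty, the string is probably in some other encoding and counting characters will not give the number of bits.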

Hope this helps,
Jason

Here is another thread discussing the same thing

And here is an explanation of the FPS format:

http://chemfp.com/fps_format/
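
In case it helps, here is a minimal Python sketch of why 16000 bits could show up as 4000 characters. It assumes the writer node emits hex-encoded fingerprints as described for the FPS format, which I have not confirmed against the node itself. Under that assumption each character carries 4 bits, so a letter such as “A” would be a hex digit (bit pattern 1010) rather than a count. The function name and the example string are made up for illustration, and the bit order within each byte (least significant bit first) is my reading of the FPS description.

```python
def hex_fp_to_bits(hex_fp):
    """Expand a hex-encoded fingerprint into a list of 0/1 integers.

    Assumption: bytes appear in order and bit 0 is the least
    significant bit of the first byte.
    """
    raw = bytes.fromhex(hex_fp)  # accepts upper- and lower-case hex digits
    return [(byte >> i) & 1 for byte in raw for i in range(8)]


example = "0A00ff01"  # invented 32-bit fingerprint written as 8 hex characters
bits = hex_fp_to_bits(example)
print(len(example), len(bits), sum(bits))  # 8 32 11

# A 4000-character hex string expands to 16000 bits.
print(len(hex_fp_to_bits("0" * 4000)))  # 16000
```

The length and the number of set bits do not depend on the bit-order assumption; only the position of each bit does.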

Thanks,
Jason

Hi Baptiste,

From the question it’s a bit tricky for me to figure out what the problem is.
Can you please share a small workflow that demonstrates what you’re doing?

-greg