Regex Extractor streamable?

Hello @qqilihq,

Thanks again for your work on Palladian & Selenium.

I have a quick question: is there any plan to make the Regex Extractor streamable?

Thanks in advance,
Sébastien

Hi Sébastien,

To be honest, we haven’t considered it so far, but since you’re asking, there seems to be some reason why it could make sense :) Is it simply because you’re processing high volumes of data, or are there other reasons as well?

We will see how we can implement this. For most nodes this is a no-brainer, but this node does not do a strict 1:1 mapping from input rows to output rows, which makes things a bit harder.

Best,
Philipp
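
To illustrate why the missing 1:1 mapping matters for streaming, here is a minimal Python sketch. It is not the Palladian or KNIME implementation; the function name `stream_regex_extract` and the `(row_id, text)` row shape are assumptions made only for this example. It shows a regex extractor that consumes rows lazily and emits zero, one, or several output rows per input row.

```python
import re
from typing import Iterable, Iterator, Tuple

# Hypothetical row shape for this sketch: (row_id, text)
Row = Tuple[str, str]


def stream_regex_extract(rows: Iterable[Row], pattern: str) -> Iterator[Row]:
    """Lazily yield (row_id, match) pairs.

    Each input row can produce zero, one, or many output rows,
    so the output is not a 1:1 mapping of the input.
    """
    compiled = re.compile(pattern)
    for row_id, text in rows:
        # finditer is itself lazy, so no full result table is built
        for match in compiled.finditer(text):
            yield row_id, match.group(0)


if __name__ == "__main__":
    pages = [
        ("p1", "<b>12</b> and <b>7</b>"),  # two matches
        ("p2", "nothing here"),            # no match
        ("p3", "<b>3</b>"),                # one match
    ]
    for row_id, value in stream_regex_extract(iter(pages), r"\d+"):
        print(row_id, value)
```

Because the generator handles one input row at a time, memory use stays flat regardless of how many pages are processed; the extra difficulty is only that downstream steps cannot assume one output row per input row.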

We are using it to parse thousands of HTML pages returned from scraping requests (not all information is extractable through XPath :( ).
Does that make sense? :)

Absolutely, thank you! Let me see what I can do and when. I’ll keep this thread updated.

@sebversailles I’ve added the “streamability”. It’ll be available in v2.8 of the nodes, to be released within the next few weeks. If you want to test the pre-release, get in touch at mail@palladian.ai :)

–Philipp
