A little help for an R user trying to work with KNIME.

Hi everyone, 

After doing a good amount of research on KNIME, I decided to try it out. So far, I'm loving it! It's a lot faster to learn than R was! Besides, it's good to see your workflow; it keeps everything organized.

However, as with every new tool we start to learn, I have some newbie questions related to my personal project.

I'm creating a workflow to analyse the Brazilian Congress open data. I followed a KNIME web scraping tutorial, and so far so good. However, right now I need to concatenate a single row containing a URL with each of the rows of a column in another table, to complete the URL and extract more information. My question is: how can I concatenate one row from one table with each row of a column from another table? How can I copy the same row several times automatically?

Here is an example to illustrate:

1) Copy this row several times and bind the copies into the same table

Table With URL
Row ID URL
Row_0 http://www.camara.leg.br/SitCamaraWS/Deputados.asmx/ObterDetalhesDeputado?

 

2) Concatenate each copy with one row of this table. In my case, there are 513 rows to concatenate.

Table with the parameter to bind
Row ID Parameter
Row_0 Test1
Row_1 Test2

 

Thanks for your attention!

Best Regards! 

Eduardo.

 

 

Hi Eduardo,

If the columns to be appended are coming from two different tables, I would try the "Column Appender" node (you will need to make sure that the row keys are consistent between the two tables) and then concatenate the relevant columns of the resulting table by means of the "Column Combiner" node.

In order to copy the same row several times into the same table, you might want to have a look at the "One Row to Many" node. It needs an integer column containing the number of times your row is to be repeated. An easy way to insert such a column could be to use a "Constant Value Column" node.

Anyway, if I'm understanding your use case correctly, it could be even better to use the "String Manipulation" node right after your second table (the one with the parameters to be appended). The node should be configured to execute the join function, something like:

join("http://www.camara.leg.br/SitCamaraWS/Deputados.asmx/ObterDetalhesDeputado?", $Parameter$)
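For anyone who finds a script easier to read, here is a rough Python sketch of the same idea: one fixed base URL is combined with every value of the parameter column. The function name `build_urls` is made up for illustration, and `Test1`/`Test2` are just the placeholder values from the example table above, not real parameters.

```python
# Base URL taken from the example table in the question.
BASE_URL = "http://www.camara.leg.br/SitCamaraWS/Deputados.asmx/ObterDetalhesDeputado?"


def build_urls(base, parameters):
    """Append each parameter to the same base URL, one result per input row."""
    return [base + p for p in parameters]


# Placeholder parameters from the example table (hypothetical values).
urls = build_urls(BASE_URL, ["Test1", "Test2"])
for url in urls:
    print(url)
```

The result has exactly one URL per parameter row, so there is no need to duplicate the URL row first.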

Hope this helps!

Best Regards,

--

Jorge

 

Hi Menuetto,

 

Your solution worked! Thanks a lot! I managed to access all the XML data that I needed. However, now I have another problem. Some XML files that I extracted have structure problems in their elements. How can I check all of the elements with Knime?

Thanks once more! 

Best Regards, 

Eduardo.

 

Hi Eduardo,

I guess that now you have a table with a column of type String containing the XMLs to be checked, one per row.

If this is the case, I would use the String to XML node. It will convert the column to type XML, replacing the non-parseable cell values with "?".
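Outside KNIME, the same kind of check can be sketched in a few lines of Python with the standard library parser. This is only an illustration of the idea, not what the node does internally; the function name `check_well_formed` and the two sample strings are made up.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        # The parser message includes the line and column of the problem.
        return False, str(err)


# Hypothetical sample documents: the second has a closing tag with no opener.
docs = [
    "<a><b>ok</b></a>",
    "<a></b></a>",
]
results = [check_well_formed(d) for d in docs]
for doc, (ok, message) in zip(docs, results):
    print(ok, message)
```

The message returned for a broken document points at the position where parsing failed, which can help locate the faulty element.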

Best Regards,

 

--

Jorge

 

 

Menuetto, 

In my case, the problem is that some elements are missing. The problem seems to be in the original file. What I want is a node or R script that lets me check whether all the elements are there. That way, I can easily find where the errors are and maybe create a node or script to fix the problematic elements.

For example:

</periodoexercicio>
    </periodosexercicio>
    <historiconomeparlamentar>
        <filiacoespartidarias>
            </historicolider>  (in this case, the XML code is missing the opening element <historicolider>)
        </filiacoespartidarias>

Best Regards, 

Eduardo. 

Hi Eduardo,

If the XML is well formed, and the point is to check for the existence of certain elements in the document structure, I would give the XPath node a try. You will have to feed it a table with a column containing the XML documents to be checked, and configure the node with an XPath query crafted to extract the elements you are interested in. If the elements are not present in the document for a given row, the XPath query should return an empty value. XPath syntax is well explained in the node description.
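As a rough sketch of the same presence check in Python (standard library only, which supports a limited XPath subset): the element names come from the snippet posted above, while the root element `deputado`, the sample document, and the function name `missing_elements` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Elements expected somewhere in each document (names from the posted snippet).
REQUIRED_PATHS = [".//filiacoespartidarias", ".//historicolider"]


def missing_elements(xml_text, paths):
    """Return the paths that match no element in a well-formed document."""
    root = ET.fromstring(xml_text)
    return [p for p in paths if root.find(p) is None]


# Hypothetical well-formed document that lacks <historicolider>.
sample = "<deputado><filiacoespartidarias/></deputado>"
missing = missing_elements(sample, REQUIRED_PATHS)
print(missing)
```

An empty list means every expected element was found; anything else names the paths to investigate for that row.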

 

Best Regards,

 

--

Jorge

 

Jorge, 

I had thought of doing what you suggested before, and it does seem to be the best option. Maybe I will try to configure another node after the XPath node to report when something is missing.

One last question, my friend: is there a prototyping node in KNIME? Something similar to Shiny in R? When I finish this project of mine, I want to prototype it.

Thanks once more!

 

Eduardo

Hi Eduardo, I'm sorry for the late reply. 

I'm familiar neither with Shiny nor with any similar feature of the "vanilla" KNIME Analytics Platform, but I guess it might be somewhat similar to the KNIME WebPortal, built into the KNIME Server commercial extension (https://www.knime.org/knime-server).

Best Regards,

--

Jorge

 

Hi Jorge, 

 

No problem, my friend. Thanks for the reply. I will look into KNIME Server.

 

Best Regards,