I am looking for a way to extract GCV (gross calorific value) values and their tolerances from free-text sentences.
The data I am using has a free-text field in which the GCV value and tolerance limit are entered as part of the text. However, the text can be anything that users enter (examples listed below). I need to extract the GCV value and the tolerance limit from it into two columns. Any help would be greatly appreciated.
Example 1: Type: Mustard Husk Briquettes GCV: 3600 KCAL /KG. Ash: 12% (+/-2%) Moisture: Upto 10% Loading & unloading scope of vendor.
==> We need 3600 under GCV column and 0 under Tolerance Column
Example 2: Mustard Husk Briquettes GCV: 3600 KCAL (+/-200) Ash: 12% (+/-2%) Moisture: Upto 10% (+/-2%) Packing : 30-50 Kg. Loading & unl
==> Here, we need 3600 under GCV column and 200 under Tolerance Column.
Example 3: SUPPLY OF BIOMASS BRIQUETTE OF 100% SAWDUST DUST TO PROVIDE MINIMUM CALORIFIC VALUE OF 4200 (UOM - KGS) ( HSN CODE : 4401 )
==> 4200 under GCV and 0 under Tolerance
Example 4: TRIAL ORDER. 100% SAWDUST BRIQUETTES. QUALITY PARAMETERS: MIN.GROSS CALORIFIC VALUE:4000 +/-200 KCAL/KG MOISTURE CONTENT: 7-10 % AS
==> 4000 under GCV and 200 under Tolerance.
Example 5: Special Instructions : - 1. The GCV of Briquettes should be Not less than 3500 Kcal/Kg, if it is more than specification no extra
==> 3500 under GCV and 0 under Tolerance
As the examples above show, there are many such variants. I was thinking of using NLP (something similar to named entity recognition) to get the GCV and tolerance values into two columns. I would like to understand the best way to do this in KNIME.
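For reference, here is a rule-based baseline I could imagine as a starting point before moving to NER. It is only a sketch under a few assumptions: the value always follows the keyword "GCV" or "CALORIFIC VALUE", it is a 3 to 5 digit number, and the tolerance (if present) is written as "+/-" or "±" shortly after the value and is not a percentage. The names `GCV_PATTERN` and `extract_gcv` are made up for illustration, and the pattern is tuned only to the five examples above, so it will likely need adjusting for other variants. In KNIME this kind of logic could presumably be run from a Python Script node.

```python
import re

# Hypothetical pattern, tuned only to the examples in this post.
GCV_PATTERN = re.compile(
    r"(?:GCV|CALORIFIC\s+VALUE)"   # keyword that announces the value
    r"\D{0,60}?"                   # short non-digit filler, e.g. ": " or " should be not less than "
    r"(\d{3,5})(?!\d)"             # the GCV value itself (assumed 3 to 5 digits)
    r"(?:[^\d%]{0,20}?"            # optional unit text such as " KCAL ("
    r"(?:\+\s*/\s*-|±)\s*"         # tolerance marker: "+/-" or "±"
    r"(\d+)(?!\d)(?!\s*%))?",      # tolerance value, ignored if it is a percentage
    re.IGNORECASE,
)


def extract_gcv(text):
    """Return (gcv, tolerance) as integers, or (None, None) if no GCV is found.

    Tolerance defaults to 0 when the text gives a GCV without a tolerance.
    """
    match = GCV_PATTERN.search(text)
    if match is None:
        return None, None
    gcv = int(match.group(1))
    tolerance = int(match.group(2)) if match.group(2) else 0
    return gcv, tolerance


if __name__ == "__main__":
    samples = [
        "Type: Mustard Husk Briquettes GCV: 3600 KCAL /KG. Ash: 12% (+/-2%)",
        "Mustard Husk Briquettes GCV: 3600 KCAL (+/-200) Ash: 12% (+/-2%)",
        "MIN.GROSS CALORIFIC VALUE:4000 +/-200 KCAL/KG MOISTURE CONTENT: 7-10 %",
    ]
    for sample in samples:
        print(extract_gcv(sample), "<-", sample)
```

Rows where the function returns `(None, None)` are the ones the rules do not cover; those would be the candidates for a more flexible NLP approach.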
Thanks and regards