In my dataset, the granularity plays a crucial role, and I am attempting to eliminate null values at the end of each column for every granularity. However, the code I executed in KNIME not only removes the null values but also alters the shape of the data for each granularity, resulting in consistent lengths. For example, if the first granularity has 20 non-null rows and another granularity has 25 non-null rows, the code should ideally maintain 20 non-null rows for both granularities, yet it inadvertently alters the data shape.
Hi @Rohit_208, could you give more detail and an example to help people help you?
e.g.
However, the code I executed in KNIME not only removes the null values but also alters the shape of the data for each granularity, resulting in consistent lengths.
- What code have you executed?
For example, if the first granularity has 20 non-null rows and another granularity has 25 non-null rows, the code should ideally maintain 20 non-null rows for both granularities
- Can you upload a small, anonymized example of data that demonstrates the problem you are having?
yet it inadvertently alters the data shape
- Can you show what output you require, and what output you currently get, with a brief description of the important and/or non-obvious differences?
Ideally, please upload a workflow that exhibits the problem. Thanks.
import pandas as pd

# In the KNIME Python Script node, df is assumed to already hold the input table.
df.replace(0, 1e-10, inplace=True)

def Values_remove_last_NaN(df):
    df = df.sort_values('Date')
    numeric_columns = df.select_dtypes(include='number').columns
    last_non_nan_index = None
    # Walk backwards to find the last row with at least one non-null numeric value.
    for idx in reversed(df.index):
        if df.loc[idx, numeric_columns].notna().any():
            last_non_nan_index = idx
            break
    if last_non_nan_index is not None:
        return df.loc[:last_non_nan_index]
    return pd.DataFrame(columns=df.columns)

unique_products = df['PL-1'].unique()
results = []
for product in unique_products:
    product_df = df[df['PL-1'] == product]
    product_df = product_df.sort_values(by='Date')
    cleaned_df = Values_remove_last_NaN(product_df)
    results.append(cleaned_df)
final_df = pd.concat(results)
This is my code, which gives me the correct result in Jupyter but not in the KNIME Python Script node. The data has a PL-1 column that contains different PL values, so I filter the data on each unique PL and then try to remove null values only if they are at the end of the column. This works in Jupyter but not in KNIME. In KNIME, suppose the first PL is 13 and it has 20 rows, and the second PL also has 20 rows, all of them non-null. After executing the code, it removes the last value from both. Why does it remove a value when the code should only remove null values at the end and otherwise keep the data as it is? This is the issue I am facing right now.
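For reference, the intended behaviour described above can be sketched in a self-contained way. This is only a minimal illustration, not the code from the post: the column names `PL-1`, `Date` and the sample column `Sales` are assumptions, and the helper names `trim_trailing_nan_rows` and `trim_per_group` are hypothetical. It works on row positions rather than index labels, and it leaves the grouping column out of the null check, since a numeric `PL-1` column would otherwise count as a non-null value on every row.

```python
import numpy as np
import pandas as pd


def trim_trailing_nan_rows(group, value_cols, date_col="Date"):
    """Drop only the rows at the end of `group` where every value column is null."""
    group = group.sort_values(date_col)
    # True for each row that has at least one non-null value.
    has_value = group[value_cols].notna().any(axis=1).to_numpy()
    if not has_value.any():
        return group.iloc[0:0]  # nothing but nulls: return an empty frame
    last_pos = int(np.flatnonzero(has_value)[-1])
    # Positional slice, so the result does not depend on the index labels.
    return group.iloc[: last_pos + 1]


def trim_per_group(df, group_col="PL-1", date_col="Date"):
    """Apply the trailing-null trim to each group separately."""
    # Assumption: every numeric column except the grouping column holds values.
    value_cols = [
        c for c in df.select_dtypes(include="number").columns if c != group_col
    ]
    parts = [
        trim_trailing_nan_rows(g, value_cols, date_col)
        for _, g in df.groupby(group_col, sort=False)
    ]
    return pd.concat(parts) if parts else df.iloc[0:0]


# Made-up sample data: group A ends with two null rows, group B has an interior null.
sample = pd.DataFrame(
    {
        "PL-1": ["A", "A", "A", "B", "B", "B"],
        "Date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"] * 2),
        "Sales": [1.0, np.nan, np.nan, 2.0, np.nan, 3.0],
    }
)
result = trim_per_group(sample)
print(result)
```

With this sample, group A keeps only its first row, while group B keeps all three rows because its null is not at the end. Each group keeps its own length; nothing forces the groups to the same number of rows.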
Hi @Rohit_208, as mentioned,
- Can you upload a small, anonymized example of data that demonstrates the problem you are having?
- Can you show what output you require, and what output you currently get, with a brief description of the important and/or non-obvious differences?
Ideally please upload a workflow that exhibits the problem.
Unfortunately, the Python code you uploaded has been garbled a little by the forum software, as it hasn't all been marked as "preformatted".
Without the actual code, some sample data, and a clear demonstration of what output you are getting and the output you require, I don't see how anybody is going to have an easy time assisting you. Thanks
Hey @takbb,
I got the correct answer. The issue was in the data.
Thanks!