LOD LOQ and linear regression

kzukowski · April 5, 2024, 10:22am

I have a practical question regarding the determination of the LOD and LOQ. I have dose-response curve data for many samples across several assays, in a table with the columns Assay, Conditions, Sample, Concentration, and response. Each curve includes a number of points on its TOP and BOTTOM plateaus. Do you know of a tool that would, for each combination of Assay, Conditions, Sample, and Concentration, filter out the points where there is no response or a maximum response, fit a linear regression to the remaining points, and determine LOD and LOQ from the SLOPE and Sy of the fitted line (or at least report the SLOPE and Sy)? Thank you in advance for your help.

mwiegand · April 26, 2024, 7:27pm

Hi @kzukowski,

Albeit a rather unorthodox reply, since I used ChatGPT for guidance and to learn something new, I hope the bot's answer provides at least a starting point for you.

Though my first thought was that the Statistics node might be of help:

========
Response Start

For determining the Limit of Detection (LOD) and Limit of Quantitation (LOQ) based on dose-response curve data, you can use several software tools and statistical methods. Your requirements—to filter data, fit linear regression models to subsets of the data, and compute relevant statistics such as the slope and standard error of the estimate (Sy)—can be handled using a combination of data management and statistical analysis tools. Here’s a general approach using KNIME, which was mentioned in the link you’re referring to, or other tools like R or Python:

Using KNIME:

Data Preprocessing:

Use the Row Filter node to exclude data points where there is no response or maximum response.
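Use the GroupBy node to segment data by Assay, Conditions, and Sample (a Group Loop Start node may be needed to fit one model per group). Do not group by Concentration: it is the independent variable of the regression, and a group containing a single concentration cannot be fitted with a line.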

Linear Regression:

Use the Linear Regression Learner node to fit a regression model to each group of data. This node can also provide the slope (coefficient of the independent variable) and Sy (standard error of the residuals).

Calculate LOD and LOQ:

LOD can be determined as $\mathrm{LOD} = 3.3 \times \sigma / S$, where $\sigma$ is the standard deviation of the response and $S$ is the slope of the calibration curve.
LOQ is typically calculated as $\mathrm{LOQ} = 10 \times \sigma / S$.
You can calculate these using the Math Formula node to apply these formulas to the data.
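
As a quick numeric check of these formulas, here is a minimal Python sketch using NumPy only. It assumes that σ is estimated by Sy, the residual standard deviation of the fitted line with n - 2 degrees of freedom, which is one common choice and not the only one. The function name `lod_loq` and the five calibration points are made up for illustration.

```python
import numpy as np

def lod_loq(concentration, response):
    """Fit response = intercept + slope * concentration and derive LOD/LOQ.

    Assumption: sigma is taken as Sy, the residual standard deviation
    of the fit with n - 2 degrees of freedom.
    """
    x = np.asarray(concentration, dtype=float)
    y = np.asarray(response, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)       # least-squares straight line
    residuals = y - (intercept + slope * x)
    sy = np.sqrt(np.sum(residuals ** 2) / (len(x) - 2))
    return {
        "slope": slope,
        "sy": sy,
        "LOD": 3.3 * sy / slope,
        "LOQ": 10 * sy / slope,
    }

# Illustrative calibration points (not real assay data)
example = lod_loq([1, 2, 3, 4, 5], [2.1, 3.9, 6.0, 8.1, 9.9])
print(example)
```

Because both limits use the same σ and S, the LOQ is always 10 / 3.3 times the LOD; only the multiplier differs.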

Using R or Python:

Here’s a basic example of how you might approach this in Python using libraries like pandas for data manipulation and statsmodels for fitting linear regression models:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Load your data
data = pd.read_csv('your_data.csv')

# Placeholder value: set this to the response level of your assay's upper plateau
maximum_response_threshold = 100.0

# Filter out non-responsive or maximum response points
filtered_data = data[(data['response'] > 0) & (data['response'] < maximum_response_threshold)]

# Group data and fit one linear regression per curve.
# Concentration is the independent variable, so it must not be a grouping key.
results = {}
for name, group in filtered_data.groupby(['Assay', 'Conditions', 'Sample']):
    if len(group) < 3 or group['Concentration'].nunique() < 2:
        continue  # not enough points to estimate both slope and Sy
    X = sm.add_constant(group['Concentration'])  # Adds a constant term for the intercept
    model = sm.OLS(group['response'], X).fit()
    slope = model.params['Concentration']
    # Sy: residual standard deviation of the fit (n - 2 degrees of freedom).
    # Note that model.bse would give the standard error of the slope instead.
    sy = np.sqrt(model.mse_resid)
    lod = 3.3 * sy / slope
    loq = 10 * sy / slope

    results[name] = {'slope': slope, 'sy': sy, 'LOD': lod, 'LOQ': loq}

# Convert results to a DataFrame for easier viewing/manipulation
results_df = pd.DataFrame.from_dict(results, orient='index')
print(results_df)
```
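
The filter above relies on one fixed response cutoff for every curve. If the plateaus differ between assays or samples, a per-curve filter may be more practical. The sketch below rescales each curve's response to its own minimum and maximum and keeps only the points in between. The function name `keep_linear_range` and the 10 % and 90 % cutoffs are assumptions for illustration, not a validated rule, and the demo table is invented.

```python
import pandas as pd

GROUP_COLS = ["Assay", "Conditions", "Sample"]

def keep_linear_range(df, lower=0.10, upper=0.90):
    """Drop points near the BOTTOM and TOP plateau of each curve.

    The response is rescaled per curve to 0..1 (min..max of that curve).
    Rows with a scaled response outside (lower, upper) are removed.
    Curves with no response range at all (max == min) are dropped entirely,
    because the scaled value is NaN and fails both comparisons.
    """
    grouped = df.groupby(GROUP_COLS)["response"]
    bottom = grouped.transform("min")
    top = grouped.transform("max")
    scaled = (df["response"] - bottom) / (top - bottom)
    return df[(scaled > lower) & (scaled < upper)]

# Invented demo data: one sigmoid-like curve and one flat curve
demo = pd.DataFrame({
    "Assay": ["A1"] * 10,
    "Conditions": ["C1"] * 10,
    "Sample": ["S1"] * 7 + ["S2"] * 3,
    "Concentration": [0.01, 0.1, 1, 3, 10, 30, 100, 1, 10, 100],
    "response": [0.0, 0.2, 2.0, 5.0, 8.0, 9.8, 10.0, 1.0, 1.0, 1.0],
})
kept = keep_linear_range(demo)
print(kept)
```

The output of `keep_linear_range` can then be passed to the per-curve regression in place of `filtered_data`.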

Recommendations:

Experiment with different statistical models depending on the nature of your data. If the relationship between concentration and response is not linear, consider using nonlinear regression models.
Consider using robust regression techniques if your data contains outliers that could affect the fit of the standard linear regression model.
Use visual tools to inspect the fit of your models to ensure that they adequately describe the data. This could include plotting the fitted values against the actual values.

If you need further guidance on setting up these analyses in a specific software or interpreting the results, feel free to ask!

========
Response End

Best
Mike