Batch Execution

If you have several dose-response datasets, you can run them as a batch.

For example, consider a CSV with one row per dataset, using commas to separate columns, and semicolons to separate dose groups within a column:

ID,Dose,Incidence,N
1,0;0.5;1,0;3;5,5;5;5
2,0;0.33;0.67;1,0;0;4;5,5;5;5;5

To run in pybmds, you’ll first need to load the dataset into a data frame using the pandas library:

import pandas as pd

df = pd.read_csv("data/batch.csv")
df.head()

   ID           Dose  Incidence        N
0   1        0;0.5;1      0;3;5    5;5;5
1   2  0;0.33;0.67;1    0;0;4;5  5;5;5;5
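
Before building datasets, it can help to confirm that every row lists the same number of dose groups in each column. The sketch below is a standalone check using only the standard library. `SAMPLE` and `parse_row` are hypothetical names introduced for illustration, and the check reflects an assumption about what makes a row usable; it is not something pybmds requires you to run.

```python
import csv
import io

# The two example rows from above, inlined so the sketch runs on its own.
SAMPLE = """ID,Dose,Incidence,N
1,0;0.5;1,0;3;5,5;5;5
2,0;0.33;0.67;1,0;0;4;5,5;5;5;5
"""


def parse_row(row):
    """Split the semicolon-delimited fields of one CSV row into lists."""
    doses = [float(x) for x in row["Dose"].split(";")]
    incidences = [int(x) for x in row["Incidence"].split(";")]
    ns = [int(x) for x in row["N"].split(";")]
    # Every dose group needs a dose, an incidence, and a group size.
    if not (len(doses) == len(incidences) == len(ns)):
        raise ValueError(f"dataset {row['ID']}: columns differ in number of dose groups")
    return {"id": int(row["ID"]), "doses": doses, "incidences": incidences, "ns": ns}


parsed = [parse_row(row) for row in csv.DictReader(io.StringIO(SAMPLE))]
print([len(p["doses"]) for p in parsed])
```

A row with mismatched columns raises a `ValueError` naming the dataset ID, so the problem surfaces before any model is run.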

To model the data, convert each row of the data frame into a pybmds.DichotomousDataset object, producing a list of datasets:

import pybmds


def create_dataset(row):
    return pybmds.DichotomousDataset(
        id=row.ID,
        doses=list(map(float, row.Dose.split(";"))),
        ns=list(map(int, row.N.split(";"))),
        incidences=list(map(int, row.Incidence.split(";"))),
    )


dichotomous_datasets = df.apply(create_dataset, axis=1).tolist()

# plot the first dataset as an example
dichotomous_datasets[0].plot()

Single model, multiple datasets

With datasets loaded, we can run a single model for each dataset:

from pybmds.models import dichotomous

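# fit a 2nd-degree Multistage model to each dataset and keep the fitted models
dichotomous_results = []
for dataset in dichotomous_datasets:
    model = dichotomous.Multistage(dataset=dataset, settings={"degree": 2})
    model.execute()  # fit results are then available as model.results
    dichotomous_results.append(model)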

The fitted models can then be summarized in a simple table of results:

outputs = [
    [
        model.dataset.metadata.id,
        model.name(),
        model.results.bmd,
        model.results.bmdl,
        model.results.bmdu,
    ]
    for model in dichotomous_results
]
output_df = pd.DataFrame(data=outputs, columns="Dataset-ID Name BMD BMDL BMDU".split())
output_df.head()

   Dataset-ID          Name       BMD      BMDL      BMDU
0           1  Multistage 2  0.159996  0.023905  0.240851
1           2  Multistage 2  0.192018  0.074073  0.271778
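
In a larger batch, an exception raised while fitting one dataset would stop the loop above before the remaining datasets are run. A minimal sketch of one way to isolate such failures follows. `run_each` and `demo_fit` are hypothetical helpers, not part of pybmds, and `fit` stands in for whatever function builds and executes a model.

```python
def run_each(items, fit):
    """Apply fit to every item, collecting results and errors separately."""
    results, failures = [], []
    for index, item in enumerate(items):
        try:
            results.append(fit(item))
        except Exception as exc:
            # Keep going; record which item failed and why.
            failures.append((index, repr(exc)))
    return results, failures


# Stand-in for a model fit (an assumption for the demo): fails on negative input.
def demo_fit(x):
    if x < 0:
        raise ValueError("negative input")
    return x * 2


results, failures = run_each([1, -1, 3], demo_fit)
print(results)
print(failures)
```

With the datasets above, `fit` could be a function that constructs the `dichotomous.Multistage` model, calls `execute()`, and returns the model; the failure list then identifies which datasets need a closer look.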

Session batch execution

Alternatively, you could run a session for each dataset that executes a suite of models and recommends the best-fitting one:

# function takes a dataset as input and returns an execution response
def runner(ds):
    sess = pybmds.Session(dataset=ds)
    sess.add_model(pybmds.Models.Logistic, settings={"bmr": 0.2})
    sess.add_model(pybmds.Models.Probit, settings={"bmr": 0.2})
    sess.execute_and_recommend()
    return pybmds.BatchResponse(success=True, content=[sess.to_dict()])


# execute all datasets and sessions on a single processor
batch = pybmds.BatchSession().execute(dichotomous_datasets, runner, nprocs=1)

Save Excel and Word reports:

batch.to_excel("output/batch.xlsx")
batch.to_docx().save("output/batch.docx")
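
The paths above write into an `output` directory. Assuming that directory may not exist yet, a small standard-library step can create it before saving. `ensure_dir` is a hypothetical helper name used only for this sketch.

```python
from pathlib import Path


def ensure_dir(path):
    """Create the directory (and any parents) if missing; return it as a Path."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    return directory


out_dir = ensure_dir("output")
print(out_dir / "batch.xlsx")
```

Calling it a second time is harmless, because `exist_ok=True` makes the call a no-op when the directory is already there.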

You could also run two sessions per dataset, for example one for each of two benchmark responses (BMRs). Only the runner function needs to change:

def runner2(ds):
    sess1 = pybmds.Session(dataset=ds)
    sess1.add_model(pybmds.Models.Logistic, settings={"bmr": 0.1})
    sess1.add_model(pybmds.Models.Probit, settings={"bmr": 0.1})
    sess1.execute_and_recommend()

    sess2 = pybmds.Session(dataset=ds)
    sess2.add_model(pybmds.Models.Logistic, settings={"bmr": 0.2})
    sess2.add_model(pybmds.Models.Probit, settings={"bmr": 0.2})
    sess2.execute_and_recommend()

    return pybmds.BatchResponse(success=True, content=[sess1.to_dict(), sess2.to_dict()])


batch = pybmds.BatchSession().execute(dichotomous_datasets, runner2, nprocs=1)