Tumor and Multitumor Data

Conducting dose-response analysis on dichotomous tumor data differs from analyzing standard dichotomous tumor data in the following ways:

  • The Multistage cancer model uses different parameter settings for model fit than the standard Multistage model.

  • A cancer slope factor is calculated.

  • In some cases, there may be a need to combine multiple tumor datasets and then calculate a single cancer slope factor.

To that end, this guide covers some different approaches that you can use in pybmds for handling tumor data.

Quickstart

To run a single dataset:

import pybmds
from pybmds.models.dichotomous import MultistageCancer

dataset = pybmds.DichotomousDataset(
    doses=[0, 25, 75, 125, 200],
    ns=[20, 20, 20, 20, 20],
    incidences=[0, 1, 7, 15, 19],
    name="Tumor dataset A",
    dose_units="mg/kg-d",
)

model = MultistageCancer(dataset, settings={"bmr": 0.1})
model.execute(slope_factor=True)

print(f"BMD = {model.results.bmd:f}")
print(f"BMDL = {model.results.bmdl:f}")
print(f"CSF = {model.results.slope_factor:f}")

fig = model.plot()
BMD = 35.929912
BMDL = 18.233585
CSF = 0.005484
../_images/2ec43d743b77d5ae17e77f78599755e00bd8b4253d93db6bafe86cecf8e2e4cc.png

To run multiple datasets and calculate a single combined slope factor:

import pybmds

datasets = [
    pybmds.DichotomousDataset(
        doses=[0, 25, 75, 125, 200],
        ns=[20, 20, 20, 20, 20],
        incidences=[0, 1, 7, 15, 19],
        name="Tumor A",
        dose_units="mg/m³",
    ),
    pybmds.DichotomousDataset(
        doses=[0, 25, 75, 125, 200],
        ns=[20, 20, 20, 20, 20],
        incidences=[0, 0, 1, 7, 11],
        name="Tumor B",
        dose_units="mg/m³",
    ),
]

session = pybmds.Multitumor(datasets, settings={"bmr": 0.2}, name="Example")
session.execute()

fig = session.plot()
../_images/8f3b6f1a38f0a8594235dcb2d33a761f92240af964e08b41b4a3bec48df87178.png

To view individual model results for selected models for each dataset:

# Print overall results
print("Overall")
print(f"BMD = {session.results.bmd:f}")
print(f"BMDL = {session.results.bmdl:f}")
print(f"CSF = {session.results.slope_factor:f}")
print()

# Print individual model results
selected_model_indexes = session.results.selected_model_indexes
for i, dataset_models in enumerate(session.models):
    selected_index = selected_model_indexes[i]
    selected_model = dataset_models[selected_index]
    print(f"{selected_model.dataset.metadata.name}: {selected_model.name()}")
    print(f"BMD = {selected_model.results.bmd:f}")
    print(f"BMDL = {selected_model.results.bmdl:f}")
    print(f"CSF = {selected_model.results.slope_factor:f}")
    print()
Overall
BMD = 18.835597
BMDL = 15.066557
CSF = 0.013274

Tumor A: Multistage 1
BMD = 24.653310
BMDL = 18.882341
CSF = 0.010592

Tumor B: Multistage 1
BMD = 79.818267
BMDL = 55.735346
CSF = 0.003588

Create a tumor dataset

Create a tumor dataset using the same method as a dichotomous dataset.

As with a dichotomous dataset, provide a list of doses, incidences, and the total number of subjects, one item per dose-group.

You can also add optional attributes, such as name, dose_name, dose_units, response_name, response_units, etc.

dataset = pybmds.DichotomousDataset(
    name="Chemical X Tumor A",
    dose_units="ppm",
    doses=[0, 25, 75, 125, 200],
    ns=[20, 20, 20, 20, 20],
    incidences=[0, 1, 7, 15, 19],
)

print(dataset.tbl())
fig = dataset.plot()
╒════════╤═════════════╤═════╕
│   Dose │   Incidence │   N │
╞════════╪═════════════╪═════╡
│      0 │           0 │  20 │
│     25 │           1 │  20 │
│     75 │           7 │  20 │
│    125 │          15 │  20 │
│    200 │          19 │  20 │
╘════════╧═════════════╧═════╛
../_images/390da5ad1f55f020ed3b661fa4880eee4a37fbb0d5a971b272243c72110c1a0a.png

Single dataset fit

With a single tumor dataset defined above, you can run a single Multistage cancer model:

import pybmds
from pybmds.models.dichotomous import MultistageCancer

model = MultistageCancer(dataset, settings={"bmr": 0.10, "degree": 2})
model.execute(slope_factor=True)
fig = model.plot()
../_images/b7ca8068a3d4193b08ca6a1c632643ed7dc31e8e0242522422d470d5c908bbf2.png

After executing, results are stored in a results attribute on the model. You can view individual items in the results by accessing:

print(model.name())
print(f"BMD = {session.results.bmd:f}")
print(f"BMDL = {session.results.bmdl:f}")
print(f"CSF = {session.results.slope_factor:f}")
Multistage 2
BMD = 18.835597
BMDL = 15.066557
CSF = 0.013274

Or generate a text report to view a summary:

print(model.text())
      Multistage 2 Model      
══════════════════════════════

Version: pybmds 25.2a2 (bmdscore 25.1)

Input Summary:
╒══════════════════════════════╤════════════════════════╕
│ BMR                          │ 10% Extra Risk         │
│ Confidence Level (one sided) │ 0.95                   │
│ Modeling approach            │ frequentist_restricted │
│ Degree                       │ 2                      │
╘══════════════════════════════╧════════════════════════╛

Parameter Settings:
╒═════════════╤═══════════╤═══════╤═══════╕
│ Parameter   │   Initial │   Min │   Max │
╞═════════════╪═══════════╪═══════╪═══════╡
│ g           │     -17   │   -18 │    18 │
│ b1          │       0.1 │     0 │ 10000 │
│ b2          │       0.1 │     0 │ 10000 │
╘═════════════╧═══════════╧═══════╧═══════╛

Modeling Summary:
╒════════════════╤══════════════╕
│ BMD            │  35.9299     │
│ BMDL           │  18.2336     │
│ BMDU           │  42.1027     │
│ Slope Factor   │   0.00548438 │
│ AIC            │  68.4561     │
│ Log-Likelihood │ -32.2281     │
│ P-Value        │   0.979879   │
│ Overall d.f.   │   3          │
│ Chi²           │   0.185606   │
╘════════════════╧══════════════╛

Model Parameters:
╒════════════╤═════════════╤════════════╤══════════════╕
│ Variable   │    Estimate │ On Bound   │ Std Error    │
╞════════════╪═════════════╪════════════╪══════════════╡
│ g          │ 1.523e-08   │ yes        │ Not Reported │
│ b1         │ 3.10908e-05 │ no         │ 0.279639     │
│ b2         │ 8.07489e-05 │ no         │ 0.795918     │
╘════════════╧═════════════╧════════════╧══════════════╛
Standard errors estimates are not generated for parameters estimated on corresponding bounds,
although sampling error is present for all parameters, as a rule. Standard error estimates may not
be reliable as a basis for confidence intervals or tests when one or more parameters are on bounds.


Goodness of Fit:
╒════════╤════════╤════════════╤════════════╤════════════╤═══════════════════╕
│   Dose │   Size │   Observed │   Expected │   Est Prob │   Scaled Residual │
╞════════╪════════╪════════════╪════════════╪════════════╪═══════════════════╡
│      0 │     20 │          0 │  3.046e-07 │  1.523e-08 │      -0.000551905 │
│     25 │     20 │          1 │  0.999088  │  0.0499544 │       0.000935604 │
│     75 │     20 │          7 │  7.33062   │  0.366531  │      -0.153425    │
│    125 │     20 │         15 │ 14.3585    │  0.717926  │       0.318744    │
│    200 │     20 │         19 │ 19.2137    │  0.960686  │      -0.245902    │
╘════════╧════════╧════════════╧════════════╧════════════╧═══════════════════╛

Analysis of Deviance:
╒═══════════════╤══════════════════╤════════════╤════════════╤═════════════╤═════════════╕
│ Model         │   Log-Likelihood │   # Params │ Deviance   │ Test d.f.   │ P-Value     │
╞═══════════════╪══════════════════╪════════════╪════════════╪═════════════╪═════════════╡
│ Full model    │         -32.1362 │          5 │ -          │ -           │ -           │
│ Fitted model  │         -32.2281 │          2 │ 0.183639   │ 3           │ 0.980186    │
│ Reduced model │         -68.0292 │          1 │ 71.7859    │ 4           │ 9.54792e-15 │
╘═══════════════╧══════════════════╧════════════╧════════════╧═════════════╧═════════════╛

Change input settings

Model settings can be customized for a run, as with standard dichotomous models.

model = MultistageCancer(
    dataset,
    settings={
        "bmr_type": pybmds.DichotomousRiskType.AddedRisk,
        "bmr": 0.15,
        "degree": 3,
    },
)
print(model.settings.tbl())
╒══════════════════════════════╤════════════════════════╕
│ BMR                          │ 15% Added Risk         │
│ Confidence Level (one sided) │ 0.95                   │
│ Modeling approach            │ frequentist_restricted │
│ Degree                       │ 3                      │
╘══════════════════════════════╧════════════════════════╛

Change parameter settings

Initial parameter settings are different for the MultistageCancer model compared with the dichotomous Multistage:

from pybmds.models.dichotomous import Multistage, MultistageCancer

model = Multistage(dataset)
print("Multistage parameter settings:")
print(model.priors_tbl())

model = MultistageCancer(dataset)
print("Multistage Cancer parameter settings:")
print(model.priors_tbl())
Multistage parameter settings:
╒═════════════╤═══════════╤═══════╤═══════╕
│ Parameter   │   Initial │   Min │   Max │
╞═════════════╪═══════════╪═══════╪═══════╡
│ g           │         0 │   -18 │    18 │
│ b1          │         0 │     0 │ 10000 │
│ b2          │         0 │     0 │ 10000 │
╘═════════════╧═══════════╧═══════╧═══════╛
Multistage Cancer parameter settings:
╒═════════════╤═══════════╤═══════╤═══════╕
│ Parameter   │   Initial │   Min │   Max │
╞═════════════╪═══════════╪═══════╪═══════╡
│ g           │     -17   │   -18 │    18 │
│ b1          │       0.1 │     0 │ 10000 │
│ b2          │       0.1 │     0 │ 10000 │
╘═════════════╧═══════════╧═══════╧═══════╛

For Multistage models, the b2 parameter setting is reused for all beta parameters greater than or equal to b2.

These can be updated:

model.settings.priors.update("g", initial_value=0, min_value=-10, max_value=10)
model.settings.priors.update("b1", initial_value=10, min_value=0, max_value=100)
model.settings.priors.update("b2", initial_value=20, min_value=0, max_value=1000)

print(model.priors_tbl())
╒═════════════╤═══════════╤═══════╤═══════╕
│ Parameter   │   Initial │   Min │   Max │
╞═════════════╪═══════════╪═══════╪═══════╡
│ g           │         0 │   -10 │    10 │
│ b1          │        10 │     0 │   100 │
│ b2          │        20 │     0 │  1000 │
╘═════════════╧═══════════╧═══════╧═══════╛

Fit multiple models

The previous example runs a single Multitumor model to a single dataset. However, you may want, for example, to run multiple multitumor models of varying degrees to a single dataset.

Multiple dataset fit

To fit multiple models and one or more datasets, use an instance of the Multitumor class:

import pybmds

datasets = [
    pybmds.DichotomousDataset(
        doses=[0, 25, 75, 125, 200],
        ns=[20, 20, 20, 20, 20],
        incidences=[0, 1, 7, 15, 19],
        name="Tumor A",
        dose_units="mg/m³",
    ),
    pybmds.DichotomousDataset(
        doses=[0, 25, 75, 125, 200],
        ns=[20, 20, 20, 20, 20],
        incidences=[0, 0, 1, 7, 11],
        name="Tumor B",
        dose_units="mg/m³",
    ),
]

session = pybmds.Multitumor(datasets)
session.execute()

print(session.results.tbl())
fig = session.plot()
╒══════════════════════════════════╤════════════╕
│ BMD                              │   8.8935   │
│ BMDL                             │   7.1139   │
│ BMDU                             │  11.2427   │
│ Slope Factor                     │   0.014057 │
│ Combined Log-likelihood          │ -70.8752   │
│ Combined Log-likelihood Constant │ -53.1841   │
╘══════════════════════════════════╧════════════╛
../_images/5aa15bf22b4b6c0be113565e20f324b748e87f955ba6ab4b826f44a0b4d0b176.png

You can generate Excel and Word exports:

# save excel report
df = session.to_df()
df.to_excel("output/report.xlsx")

# save to a word report
report = session.to_docx()
report.save("output/report.docx")

Change model settings

Settings for all datasets and models should be configured globally and are applied to all models:

session = pybmds.Multitumor(datasets, settings={
    "bmr_type": pybmds.DichotomousRiskType.AddedRisk,
    "bmr": 0.15,
})

Change model degree

By default, multiple models are executed for each dataset, where the degree is varied from 1 to the number of doses minus 1 (and a maximum of 8).

For this example, we first create three datasets:

datasets = [
    pybmds.DichotomousDataset(
        doses=[0, 2, 3, 4, 5, 6, 7, 8, 9],
        ns=[20, 20, 20, 20, 20, 20, 20, 20, 20],
        incidences=[0, 1, 4, 8, 11, 12, 13, 14, 15],
        name="Tumor A (9 groups)",
        dose_units="mg/m³",
    ),
    pybmds.DichotomousDataset(
        doses=[0, 2, 3, 4, 5, 6, 7, 8, 9],
        ns=[20, 20, 20, 20, 20, 20, 20, 20, 20],
        incidences=[0, 1, 7, 15, 19, 19, 19, 19, 19],
        name="Tumor B (9 groups)",
        dose_units="mg/m³",
    ),
    pybmds.DichotomousDataset(
        doses=[0, 2, 3, 4, 5],
        ns=[20, 20, 20, 20, 20],
        incidences=[0, 0, 1, 7, 11],
        name="Tumor C (5 groups)",
        dose_units="mg/m³",
    ),
]

Next, we specify which model degrees to run for each dataset using degrees. Setting a value of 0 runs all degrees available up to a maximum of 8; specifying a specific degree will only run the specified degree.

degrees = [0, 3, 2]
session = pybmds.Multitumor(datasets, degrees=degrees)
session.execute()
fig = session.plot()
../_images/f4cbf66a95dab97bad2ae39856877a91c4445c69f6745bac6ab05c9564945216.png

The analysis executed the following models for each dataset:

for dataset_models in session.models:
    print(f"{dataset_models[0].dataset.metadata.name}")
    for model in dataset_models:
        print("\t" + model.name())
Tumor A (9 groups)
	Multistage 1
	Multistage 2
	Multistage 3
	Multistage 4
	Multistage 5
	Multistage 6
	Multistage 7
	Multistage 8
Tumor B (9 groups)
	Multistage 3
Tumor C (5 groups)
	Multistage 2

Note that the Multistage and MultistageCancer models both use the standard model selection workflow for dichotomous data (see BMDS User Guide), whereas the Multitumor model uses a modifed workflow specific to use of the multistage model specifically for cancer endpoints. If users would like to apply the modified Multistage model selection workflow on individual tumors, they can simply run the Multitumor model on single tumor datasets.

For illustration, consider the dataset below, modeled with the Multistage Cancer model:

from pybmds.models.dichotomous import MultistageCancer

esophogeal_tumors_female = pybmds.DichotomousDataset(
        doses=[0, 0.002, 0.004, 0.009, 0.018, 0.036, 0.072, 0.107, 0.143],
        ns=[240, 60, 60, 60, 60, 60, 60, 60, 60],
        incidences=[0, 0, 0, 0, 0, 0, 2, 8, 16],
        name="Esophogeal tumors - females",
        dose_units="mg/kg-d",
)

model = MultistageCancer(dataset, settings={"bmr": 0.1})
model.execute(slope_factor=True)

print(f"BMD = {model.results.bmd:f}")
print(f"BMDL = {model.results.bmdl:f}")
print(f"CSF = {model.results.slope_factor:f}")

fig = model.plot()
print(model.text())
BMD = 35.929912
BMDL = 18.233585
CSF = 0.005484
      Multistage 2 Model      
══════════════════════════════

Version: pybmds 25.2a2 (bmdscore 25.1)

Input Summary:
╒══════════════════════════════╤════════════════════════╕
│ BMR                          │ 10% Extra Risk         │
│ Confidence Level (one sided) │ 0.95                   │
│ Modeling approach            │ frequentist_restricted │
│ Degree                       │ 2                      │
╘══════════════════════════════╧════════════════════════╛

Parameter Settings:
╒═════════════╤═══════════╤═══════╤═══════╕
│ Parameter   │   Initial │   Min │   Max │
╞═════════════╪═══════════╪═══════╪═══════╡
│ g           │     -17   │   -18 │    18 │
│ b1          │       0.1 │     0 │ 10000 │
│ b2          │       0.1 │     0 │ 10000 │
╘═════════════╧═══════════╧═══════╧═══════╛

Modeling Summary:
╒════════════════╤══════════════╕
│ BMD            │  35.9299     │
│ BMDL           │  18.2336     │
│ BMDU           │  42.1027     │
│ Slope Factor   │   0.00548438 │
│ AIC            │  68.4561     │
│ Log-Likelihood │ -32.2281     │
│ P-Value        │   0.979879   │
│ Overall d.f.   │   3          │
│ Chi²           │   0.185606   │
╘════════════════╧══════════════╛

Model Parameters:
╒════════════╤═════════════╤════════════╤══════════════╕
│ Variable   │    Estimate │ On Bound   │ Std Error    │
╞════════════╪═════════════╪════════════╪══════════════╡
│ g          │ 1.523e-08   │ yes        │ Not Reported │
│ b1         │ 3.10908e-05 │ no         │ 0.279639     │
│ b2         │ 8.07489e-05 │ no         │ 0.795918     │
╘════════════╧═════════════╧════════════╧══════════════╛
Standard errors estimates are not generated for parameters estimated on corresponding bounds,
although sampling error is present for all parameters, as a rule. Standard error estimates may not
be reliable as a basis for confidence intervals or tests when one or more parameters are on bounds.


Goodness of Fit:
╒════════╤════════╤════════════╤════════════╤════════════╤═══════════════════╕
│   Dose │   Size │   Observed │   Expected │   Est Prob │   Scaled Residual │
╞════════╪════════╪════════════╪════════════╪════════════╪═══════════════════╡
│      0 │     20 │          0 │  3.046e-07 │  1.523e-08 │      -0.000551905 │
│     25 │     20 │          1 │  0.999088  │  0.0499544 │       0.000935604 │
│     75 │     20 │          7 │  7.33062   │  0.366531  │      -0.153425    │
│    125 │     20 │         15 │ 14.3585    │  0.717926  │       0.318744    │
│    200 │     20 │         19 │ 19.2137    │  0.960686  │      -0.245902    │
╘════════╧════════╧════════════╧════════════╧════════════╧═══════════════════╛

Analysis of Deviance:
╒═══════════════╤══════════════════╤════════════╤════════════╤═════════════╤═════════════╕
│ Model         │   Log-Likelihood │   # Params │ Deviance   │ Test d.f.   │ P-Value     │
╞═══════════════╪══════════════════╪════════════╪════════════╪═════════════╪═════════════╡
│ Full model    │         -32.1362 │          5 │ -          │ -           │ -           │
│ Fitted model  │         -32.2281 │          2 │ 0.183639   │ 3           │ 0.980186    │
│ Reduced model │         -68.0292 │          1 │ 71.7859    │ 4           │ 9.54792e-15 │
╘═══════════════╧══════════════════╧════════════╧════════════╧═════════════╧═════════════╛
../_images/b7ca8068a3d4193b08ca6a1c632643ed7dc31e8e0242522422d470d5c908bbf2.png

In this case, the 2nd degree Multistage model has been selected based on the standard dichotomous model selection workflow.

If this same tumor dataset was modeled along with another tumor dataset with the Multitumor model, the results would be:

hepatocellular_tumors_female = pybmds.DichotomousDataset(
    doses=[0, 0.002, 0.004, 0.009, 0.018, 0.036, 0.072, 0.107],
    ns=[240, 60, 60, 60, 60, 60, 60, 60],
    incidences=[13, 4, 6, 3, 6, 11, 35, 48],
    name="Hepatocellular tumors - females",
    dose_units="mg/kg-d",
)

datasets = [
    hepatocellular_tumors_female,
    esophogeal_tumors_female
]

session = pybmds.Multitumor(datasets)
session.execute()

print(session.results.tbl())
fig = session.plot()
nlopt failed: bounds 0 fail -1.80631 <= -1.80631 <= -1.94491
╒══════════════════════════════════╤═══════════════╕
│ BMD                              │     0.0234891 │
│ BMDL                             │     0.021427  │
│ BMDU                             │     0.0234891 │
│ Slope Factor                     │     4.667     │
│ Combined Log-likelihood          │  -292.531     │
│ Combined Log-likelihood Constant │ -9999         │
╘══════════════════════════════════╧═══════════════╛
../_images/174ca7109cc24cbf782869807995157f6fc39f07c2a748474d4e31763501c882.png

Observe that when the esophogeal tumors are modeled with the Multitumor model, the 1st degree Multistage model is selected using the modified Multistage model selection workflow.

Users can apply the modified workflow to single tumor datasets by simply running the Multitumor model on indiviudal datasets:

datasets = [esophogeal_tumors_female]
session = pybmds.Multitumor(datasets)
session.execute()

print(session.results.tbl())
fig = session.plot()
nlopt failed: bounds 0 fail -0.482868 <= -0.482868 <= -1.94491
╒══════════════════════════════════╤═══════════════╕
│ BMD                              │     0.0882326 │
│ BMDL                             │     0.0649418 │
│ BMDU                             │     0.0882326 │
│ Slope Factor                     │     1.53984   │
│ Combined Log-likelihood          │   -75.673     │
│ Combined Log-likelihood Constant │ -9999         │
╘══════════════════════════════════╧═══════════════╛
../_images/8d3e00b0a06eb2c5d7bfd098915377781e1136a93958e31fb0b83e9e4119f57c.png

PolyK Adjustment for Early Mortality

Often, in cancer bioassays, exposure to the carcinogen of interest will result in differential survival across exposure groups, with animals in higher exposure groups dying for frequently and at earlier time points. If this is not taken into account, the unadjusted data will underestimate the true proportion of animals developing the tumor of incidence and risks will be also be underestimated. See BMDS User Guide for full details on the PolyK adjustment methods.

Consider a tumor dataset saved as a CSV with dose, day of death, and tumor status (i.e., 1 = has tumor and 0 = no tumor):

import pandas as pd

df = pd.read_csv('data/example-polyk.csv')
df.head()
dose day has_tumor
0 0.0 452 0
1 0.0 535 0
2 0.0 553 0
3 0.0 570 0
4 0.0 596 0

The PolyK adjustment can be run by calling the PolyKAdjustment class. Note that the columns in your csv corresponding to doses, days, and has_tumor variables must be defined. The polynomial degree (k) can be set by the user, but the method defaults to a value of 3. The maximum number of days on test in the bioassay (max_days) can also be manually defined by the user, but this variable will automatically be set to the highest value in the days column otherwise.

from pybmds.datasets.transforms.polyk import PolyKAdjustment

# Define which columns in the dataframe correspond to dose, time (day), and response (has_tumor)
doses = df["dose"].tolist()
days = df["day"].tolist()
has_tumor = df["has_tumor"].tolist()

# Create an instance of PolyKAdjustment
poly_k = PolyKAdjustment(
    doses=doses,
    day=days,
    has_tumor=has_tumor,
    k=3,                # Optional, default is 3
    max_day=None,       # Optional, will use max day from data if None
    dose_units="",
    dose_name="Dose",
    time_units="days"
)

# Print summary table to see adjusted N and proportions
print(poly_k.summary)

# Plot summary figure
fig1 = poly_k.summary_figure()
fig2 = poly_k.tumor_incidence_figure()
   dose     n      adj_n  incidence  proportion  adj_proportion
0   0.0  50.0  42.421583        6.0        0.12        0.141437
1  12.8  50.0  42.303325       12.0        0.24        0.283666
2  32.0  50.0  40.347370       23.0        0.46        0.570050
3  80.0  50.0  42.144682       28.0        0.56        0.664378
../_images/4983ec244e2a5beba442f33f60567d09b301aa75def7e9abcc57c4151a59fd16.png ../_images/9617f1a64918c479293ca05ff3ab1b529c05c354aa1670a325fcfe871b93f431.png

The adjusted Ns and original incidence values can now be used in a MultistageCancer or Multitumor analysis and the resulting BMD, BMDL, and cancer slope factor will reflect the increased proportion of tumor-bearing animals due to differential survival across dose groups.