Process time–concentration dataset for pharmacokinetic analysis
Source: R/processData.R
Processes a time–concentration dataset to derive analysis-ready variables and structured output for pharmacokinetic evaluation.
Arguments
- dat
A data frame containing raw time–concentration data in the standard nlmixr2 format. Column names are matched case-insensitively. Columns marked (required) must be present; columns marked (optional) are used when available:
- ID
Subject identifier (required)
- TIME
Nominal or actual time after dose (required)
- DV
Observed concentration (dependent variable) (required)
- EVID
Event ID indicating observation (0) or dosing event (1) (required)
- AMT
Dose amount for dosing records (required)
- RATE
Infusion rate (optional)
- DUR
Infusion duration (optional)
- MDV
Missing dependent variable flag (optional)
- CMT
Compartment number (optional)
- ADDL
Number of additional doses (optional)
- II
Interdose interval (optional)
- SS
Steady-state indicator (optional)
- CENS
Censoring indicator (optional)
- verbose
Logical (default = TRUE). When TRUE, the function prints detailed processing messages and summary tables to the console, including notes on data cleaning and event handling. When FALSE, these messages are suppressed and only the returned list is produced.
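As an illustration of the input layout described above, the following base R sketch builds a small toy dataset and checks for the required columns without regard to case. The data values and the names raw, required, and missing_cols are hypothetical and chosen only for this example; the check is not the function's own validation code.

```r
# Hypothetical toy dataset: one subject, one bolus dose, three observations.
# Lowercase names are used on purpose, since matching is case-insensitive.
raw <- data.frame(
  id   = c(1, 1, 1, 1),
  time = c(0, 0.5, 1, 2),
  dv   = c(NA, 9.1, 7.8, 5.9),   # no concentration on the dosing record
  evid = c(1, 0, 0, 0),          # 1 = dosing event, 0 = observation
  amt  = c(100, 0, 0, 0)         # dose amount on the dosing record only
)

# Compare the required names against the uppercased column names.
required <- c("ID", "TIME", "DV", "EVID", "AMT")
missing_cols <- setdiff(required, toupper(names(raw)))

if (length(missing_cols) > 0) {
  stop("Missing required columns: ", paste(missing_cols, collapse = ", "))
}
```

A data frame that passes this check has the minimum structure expected for dat; optional columns such as RATE, ADDL, or II can be added in the same way.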
Value
A list with two elements:
dat: A data frame containing the processed time–concentration dataset with standardized and derived pharmacokinetic variables, including resetflag, SSflag, route, dose_number, tad, DVstd, indiv_lambdaz_eligible, and others.
Datainfo: A data frame summarizing the dataset structure, including subject counts and observation counts for first-dose and multiple-dose conditions, with contextual notes.
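Because Datainfo is a two-column label/value table, individual entries can be looked up by label. The sketch below uses a stand-in data frame copied from the example output on this page rather than a real function call; storing Value as character is an assumption, made because the column mixes text and counts.

```r
# Stand-in for the Datainfo element, copied from the example output.
# Assumption: Value is a character column.
Datainfo <- data.frame(
  Infometrics = c(
    "Dose Route",
    "Dose Type",
    "Number of Subjects",
    "Number of Observations",
    "Subjects with First-Dose Interval Data",
    "Observations in the First-Dose Interval",
    "Subjects with Multiple-Dose Data",
    "Observations after Multiple Doses"
  ),
  Value = c("bolus", "combined_doses", "120", "6951",
            "120", "2276", "120", "4675"),
  stringsAsFactors = FALSE
)

# Look up single entries by their label.
dose_type  <- Datainfo$Value[Datainfo$Infometrics == "Dose Type"]
n_subjects <- as.integer(Datainfo$Value[Datainfo$Infometrics == "Number of Subjects"])
```

With a real result, the same lookups would be applied to the Datainfo element of the returned list.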
Details
This function standardizes and preprocesses time–concentration data to ensure compatibility with pharmacokinetic modeling workflows in nlmixr2. The operations follow these steps:
Standardize data
convert column names to uppercase
coerce key columns (TIME, DV, EVID, AMT, RATE, etc.) to numeric
Process events and observations
impute EVID from MDV if missing
handle censored data (CENS) by converting them to excluded records
remove or recode invalid records (e.g., observation records with DV = 0)
Expand dose events
expand dosing records using nmpkconvert() when ADDL and II are present
assign dose occasions using mark_dose_number()
Determine administration route and infusion logic
derive RATE and DUR when needed
identify route (bolus, infusion, oral) based on compartment and rate
Generate derived variables
calculate time after dose using calculate_tad()
compute dose-normalized concentration (DVstd)
flag eligible records for terminal elimination phase
Summarize dataset
classify dataset as first-dose, repeated-dose, or mixed
generate summary metrics for nlmixr2 analysis
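Several of the steps above can be sketched in base R. This is a simplified illustration, not the package's implementation: it does not call nmpkconvert(), mark_dose_number(), or calculate_tad(), the helper carry_forward is hypothetical, and it assumes that records are sorted by time within subject and that DVstd is DV divided by the most recent dose amount.

```r
# Hypothetical raw input: lowercase names, everything read in as text.
raw <- data.frame(
  id   = c("1", "1", "1", "1", "1"),
  time = c("0", "1", "2", "12", "13"),
  dv   = c(NA, "8.0", "6.0", NA, "10.0"),
  evid = c("1", "0", "0", "1", "0"),
  amt  = c("100", "0", "0", "100", "0"),
  stringsAsFactors = FALSE
)

# Standardize: uppercase column names, coerce key columns to numeric.
names(raw) <- toupper(names(raw))
num_cols <- intersect(c("TIME", "DV", "EVID", "AMT"), names(raw))
raw[num_cols] <- lapply(raw[num_cols], as.numeric)

# Carry the last non-missing value forward (NA before the first value).
carry_forward <- function(x) {
  idx <- cumsum(!is.na(x))
  c(NA, x[!is.na(x)])[idx + 1]
}

# Dose occasion: running count of dosing records within each subject.
raw$dose_number <- ave(as.numeric(raw$EVID == 1), raw$ID, FUN = cumsum)

# Time after dose: time since the most recent dosing record.
dose_time <- ifelse(raw$EVID == 1, raw$TIME, NA)
raw$tad <- raw$TIME - ave(dose_time, raw$ID, FUN = carry_forward)

# Dose-normalized concentration (assumed here to be DV / most recent dose).
dose_amt <- ifelse(raw$EVID == 1, raw$AMT, NA)
raw$dose <- ave(dose_amt, raw$ID, FUN = carry_forward)
raw$DVstd <- raw$DV / raw$dose
```

The sketch leaves out dose expansion from ADDL and II, censoring, and route detection, which the function handles in the corresponding steps.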
Classification of dosing context is based on pharmacokinetic observation records (EVID equal to 0), determining whether observed concentrations occur after the first dose, during repeated dosing, or across both contexts. The categories are:
first_dose: observations occur only after the initial administration, without repeated-dose or steady-state intervals.
repeated_doses: observations occur only after multiple administrations or under steady-state conditions.
combined_doses: observations include both first-dose and repeated-dose intervals and are analyzed together.
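The classification rule can be illustrated with a short base R sketch. The function classify_dose_type is hypothetical and simplified: it assumes that an observation belongs to the repeated-dose context when its dose_number is greater than 1 or its steady-state flag equals 1, which may differ from the criteria the package actually applies.

```r
# Hypothetical helper: classify the dosing context from observation records.
classify_dose_type <- function(evid, dose_number, ssflag = rep(0, length(evid))) {
  obs <- evid == 0
  if (!any(obs)) {
    stop("No observation records (EVID == 0) found.")
  }
  # Assumed rule: later dose occasions or steady state count as repeated dosing.
  repeated <- dose_number[obs] > 1 | ssflag[obs] == 1
  if (!any(repeated)) {
    "first_dose"
  } else if (all(repeated)) {
    "repeated_doses"
  } else {
    "combined_doses"
  }
}

# Observations only after the first dose.
type_first <- classify_dose_type(evid = c(1, 0, 0), dose_number = c(1, 1, 1))

# Observations only after the second dose.
type_repeated <- classify_dose_type(evid = c(1, 1, 0), dose_number = c(1, 2, 2))

# Observations after both the first and the second dose.
type_combined <- classify_dose_type(evid = c(1, 0, 1, 0), dose_number = c(1, 1, 2, 2))
```

Only observation records enter the decision, so a dataset with many dosing records but sampling confined to the first interval is still classified as first_dose under this rule.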
Examples
dat <- Bolus_1CPT
processData(dat)
#>
#>
#> Infometrics Value
#> ---------------------------------------- ---------------
#> Dose Route bolus
#> Dose Type combined_doses
#> Number of Subjects 120
#> Number of Observations 6951
#> Subjects with First-Dose Interval Data 120
#> Observations in the First-Dose Interval 2276
#> Subjects with Multiple-Dose Data 120
#> Observations after Multiple Doses 4675
#> ---------------------------------------- ------
#> $dat
#> # A tibble: 7,911 × 28
#> ID TIME DV LNDV MDV AMT EVID raw_dose V CL SS II
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 1 0 0 0 1 60000 1 60000 65.1 4.07 99 0
#> 2 1 0.25 1126. 7.03 0 0 0 60000 65.1 4.07 99 0
#> 3 1 0.5 870. 6.77 0 0 0 60000 65.1 4.07 99 0
#> 4 1 0.75 884. 6.78 0 0 0 60000 65.1 4.07 99 0
#> 5 1 1 1244 7.13 0 0 0 60000 65.1 4.07 99 0
#> 6 1 1.5 995. 6.90 0 0 0 60000 65.1 4.07 99 0
#> 7 1 2 946. 6.85 0 0 0 60000 65.1 4.07 99 0
#> 8 1 2.5 589. 6.38 0 0 0 60000 65.1 4.07 99 0
#> 9 1 3 754. 6.63 0 0 0 60000 65.1 4.07 99 0
#> 10 1 4 1061. 6.97 0 0 0 60000 65.1 4.07 99 0
#> # ℹ 7,901 more rows
#> # ℹ 16 more variables: SD <int>, CMT <int>, resetflag <int>, raw_EVID <dbl>,
#> # RATE <dbl>, SSflag <int>, route <chr>, dose_number <int>, tad <dbl>,
#> # dose <dbl>, iiobs <dbl>, rateobs <dbl>, routeobs <chr>, durationobs <dbl>,
#> # DVstd <dbl>, indiv_lambdaz_eligible <int>
#>
#> $Datainfo
#> Infometrics Value
#> 1 Dose Route bolus
#> 2 Dose Type combined_doses
#> 3 Number of Subjects 120
#> 4 Number of Observations 6951
#> 5 Subjects with First-Dose Interval Data 120
#> 6 Observations in the First-Dose Interval 2276
#> 7 Subjects with Multiple-Dose Data 120
#> 8 Observations after Multiple Doses 4675
#>