1 Grouped Hyper Data Frame
The examples in Chapter 1 require
library(groupedHyperframe)
Search path and loaded namespaces on the author's computer
search()
# [1] ".GlobalEnv" "package:groupedHyperframe" "package:stats" "package:graphics" "package:grDevices" "package:utils" "package:datasets"
# [8] "package:methods" "Autoloads" "package:base"
loadedNamespaces() |> sort.int()
# [1] "abind" "base" "cli" "cluster" "codetools" "compiler" "datasets" "deldir" "digest"
# [10] "doParallel" "dplyr" "evaluate" "farver" "fastmap" "fastmatrix" "foreach" "generics" "geomtextpath"
# [19] "GET" "ggplot2" "glue" "goftest" "graphics" "grDevices" "grid" "gridExtra" "groupedHyperframe"
# [28] "gtable" "htmltools" "htmlwidgets" "iterators" "jsonlite" "knitr" "lattice" "lifecycle" "magrittr"
# [37] "Matrix" "matrixStats" "methods" "nlme" "otel" "parallel" "patchwork" "pillar" "pkgconfig"
# [46] "polyclip" "pracma" "R6" "RColorBrewer" "rlang" "rmarkdown" "rstudioapi" "S7" "scales"
# [55] "SpatialPack" "spatstat.data" "spatstat.explore" "spatstat.geom" "spatstat.random" "spatstat.sparse" "spatstat.univar" "spatstat.utils" "splines"
# [64] "stats" "survival" "systemfonts" "tensor" "textshaping" "tibble" "tidyselect" "tools" "utils"
# [73] "vctrs" "viridisLite" "xfun" "yaml"
A hyper data frame (hyperframe, Chapter 25, package spatstat.geom, v3.7.0.6) contains columns that are either atomic vectors, as in a standard data frame, or lists of objects of the same class, referred to as hypercolumns. This data structure is particularly well suited to spatial analysis contexts such as medical imaging, where each element of a hypercolumn can represent the spatial information contained in a single image. For example, the hyper data frame demohyper (Section 9.9) from package spatstat.data (v3.1.9) contains a regular column Group, a point-pattern (ppp) hypercolumn Points, and a pixel-image (im) hypercolumn Image.
spatstat.data::demohyper
# Hyperframe:
# Points Image Group
# 1 (ppp) (im) a
# 2 (ppp) (im) b
# 3 (ppp) (im) a
Package groupedHyperframe (v0.3.4) introduces the grouped hyper data frame, a hyper data frame augmented with a (nested) grouping structure (Chapter 24).
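As a rough base-R analogy of the distinction between regular columns and hypercolumns described above, the sketch below stores one small coordinate matrix per row in a list column. The object toy and its columns are hypothetical and are not part of spatstat.geom or groupedHyperframe; a real hyper data frame is an object of class hyperframe, not a data.frame with a list column.

```r
# Base-R analogy only: one atomic column plus one list column.
# `toy`, `Group` and `Points` are made-up names for illustration.
toy <- data.frame(Group = c("a", "b", "a"))

# Each list element plays the role of the spatial content of one image;
# here it is just a matrix of (x, y) coordinates.
toy$Points <- list(
  cbind(x = c(0.1, 0.4), y = c(0.2, 0.9)),
  cbind(x = c(0.3, 0.5, 0.8), y = c(0.7, 0.1, 0.6)),
  cbind(x = 0.5, y = 0.5)
)

# Which columns are list-valued (the analogue of a hypercolumn)?
is_hyper <- vapply(toy, is.list, logical(1))

# A per-row summary of the list column: number of points per "image".
n_points <- vapply(toy$Points, nrow, integer(1))
```

The list column keeps one object per row while the atomic column Group behaves as in any data frame, which is the property that a hypercolumn provides for classes such as ppp and im.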
The author provides a toy dataset wrobel_lung, originally contributed by Dr. Julia Wrobel. Listing 1.1 creates a subset lung0 by dropping the columns x, y and dapi. In lung0, the only columns whose values vary within the lowest-level group image_id (under the nested grouping structure ~patient_id/image_id) are hladr and phenotype.
Listing 1.1: lung0
lung0 = wrobel_lung |>
  within.data.frame(expr = {
    x = y = NULL # drop the columns x and y
    dapi = NULL # drop the column dapi
  })
lung0
# image_id patient_id gender hladr phenotype OS age
# 1 [40864,18015].im3 #01 0-889-121 F 0.115 CK-.CD8- 3488+ 85
# 2 [40864,18015].im3 #01 0-889-121 F 0.239 CK-.CD8- 3488+ 85
# 3 [40864,18015].im3 #01 0-889-121 F 0.268 CK-.CD8- 3488+ 85
# 4 [40864,18015].im3 #01 0-889-121 F 0.245 CK-.CD8- 3488+ 85
# 5 [40864,18015].im3 #01 0-889-121 F 0.127 CK+.CD8- 3488+ 85
# 6 [40864,18015].im3 #01 0-889-121 F 0.136 CK+.CD8- 3488+ 85
# ✂️ --- output truncated --- ✂️
Listing 1.2 creates a grouped hyper data frame lung_g from the data frame lung0 (Listing 1.1) by specifying a (nested) grouping structure (Section 17.1).
Listing 1.2: lung_g
lung_g = lung0 |>
  as.groupedHyperframe(group = ~ patient_id/image_id)
Readers may view the grouped hyper data frame lung_g (Listing 1.2) by typing lung_g at the R console prompt and pressing Enter (Listing 1.3).
Listing 1.3: lung_g (Listing 1.2)
lung_g
# Grouped Hyper Data Frame: ~patient_id/image_id
#
# 15 image_id nested in
# 3 patient_id
#
# hladr phenotype image_id patient_id gender OS age
# 1 (numeric) (factor) [40864,18015].im3 #01 0-889-121 F 3488+ 85
# 2 (numeric) (factor) [42689,19214].im3 #01 0-889-121 F 3488+ 85
# 3 (numeric) (factor) [42806,16718].im3 #01 0-889-121 F 3488+ 85
# 4 (numeric) (factor) [44311,17766].im3 #01 0-889-121 F 3488+ 85
# ✂️ --- output truncated --- ✂️
Readers may also view summary information on the grouped hyper data frame lung_g (Listing 1.2) using the function summary() (Listing 1.4).
Listing 1.4: lung_g (Listing 1.2)
lung_g |>
  summary()
# Grouped Hyper Data Frame: ~patient_id/image_id
#
# 15 image_id nested in
# 3 patient_id
#
# hladr     phenotype image_id            patient_id      gender   OS                      age
# (numeric) (factor)  (factor)            (factor)        (factor) (Surv)                  (numeric)
#                     [36953,13765].im3:1 #01 0-889-121:5 F: 5     <time-to-event> :(Surv) Min.   :66.00
#                     [39206,15250].im3:1 #02 1-037-393:5 M:10     [right-censored]:5      1st Qu.:66.00
#                     [40242,17359].im3:1 #03 2-080-378:5          [observed]      :10     Median :84.00
#                     [40863,16444].im3:1                                                  Mean   :78.33
#                     [40864,18015].im3:1                                                  3rd Qu.:85.00
#                     [41191,13764].im3:1                                                  Max.   :85.00
#                     (Other)          :9
Listing 1.5 computes the quantiles of each element of the numeric hypercolumn lung_g$hladr, then aggregates them at the biologically independent grouping level patient_id (Section 2.3.3, Section 2.4).
lung_g |>
  quantile(probs = seq.int(from = .01, to = .99, by = .01)) |>
  aggregate(by = ~ patient_id)
# Hyperframe:
# patient_id gender OS age hladr.quantile
# 1 #01 0-889-121 F 3488+ 85 (numeric)
# 2 #02 1-037-393 M 1605 66 (numeric)
# 3 #03 2-080-378 M 176 84 (numeric)
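The two steps in Listing 1.5 can be imitated in base R on a hypothetical long-format data frame toy. This is only a sketch: it assumes, purely for illustration, that the quantile vectors of the images are combined by an elementwise mean within each patient, which may differ from what aggregate() does for a grouped hyper data frame. The names toy, q_by_image, img2pat and q_by_patient are invented for the example.

```r
# Hypothetical long-format data: 2 patients, 2 images each, 4 cells per image.
set.seed(1)
toy <- data.frame(
  patient_id = rep(c("p1", "p2"), each = 8),
  image_id   = rep(c("i1", "i2", "i3", "i4"), each = 4),
  hladr      = runif(16)
)
probs <- seq.int(from = .01, to = .99, by = .01)

# Step 1: one quantile vector per image (the lowest grouping level).
q_by_image <- lapply(split(toy$hladr, toy$image_id), quantile, probs = probs)

# Look up the patient to which each image belongs.
img2pat <- toy$patient_id[match(names(q_by_image), toy$image_id)]

# Step 2 (ASSUMPTION): within each patient, average the per-image
# quantile vectors elementwise.
q_by_patient <- lapply(
  split(q_by_image, img2pat),
  function(qs) Reduce(`+`, qs) / length(qs)
)
```

Each element of q_by_patient is a numeric vector with one entry per probability in probs, mirroring the (numeric) entries of the hypercolumn hladr.quantile in the output above.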