| DemographicArray-class {dembase} | R Documentation |
Classes for representing demographic arrays: essentially, arrays plus metadata.
DemographicArray is a virtual superclass, and
Counts and Values its two main subclasses. For
a discussion of what these terms mean and of R's class system see
Classes. However, to use package
dembase, it is probably enough to know that the phrase
'objects of class DemographicArray' is shorthand for 'objects of
any class that is a subclass of DemographicArray'. A list of the
subclasses of DemographicArray can be obtained using
getClass("DemographicArray").
Objects of class DemographicArray are arrays with some
specialized metadata that are useful when dealing with population data. For
instance, all objects of class DemographicArray have
dimtypes and dimscales describing the type
of variable being measured and the measurement scale. Objects of
class DemographicArray also have some specialized behaviours that
arrays do not. For instance, when two objects of class
DemographicArray are added together, the dimensions of the two
objects are automatically aligned.
Objects of class Counts hold data about numbers of people or
events, while objects of class Values hold information about
characteristics or attributes. Some functions, such as ones that
aggregate cells, treat objects of class "Counts" differently from
objects of class "Values".
Unlike ordinary arrays, objects of class DemographicArray
must have a complete set of dimnames, meaning that each dimension
must be named, and within a dimension the labels must be
unique.
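The two conditions can be illustrated on ordinary arrays with a short base R sketch. The helper hasCompleteDimnames is hypothetical and introduced here only for illustration; it is not part of dembase and is not the package's own validity check.

```r
# Hypothetical helper (not part of dembase): tests whether an ordinary
# array satisfies the two conditions described above.
hasCompleteDimnames <- function(a) {
    dn <- dimnames(a)
    # there must be dimnames, and the dimensions themselves must be named
    if (is.null(dn) || is.null(names(dn)))
        return(FALSE)
    if (any(is.na(names(dn))) || any(!nzchar(names(dn))))
        return(FALSE)
    # every dimension needs labels, and the labels must be unique
    all(vapply(dn,
               function(labels) !is.null(labels) && !anyDuplicated(labels),
               logical(1)))
}

good <- array(1:6, dim = c(3, 2),
              dimnames = list(age = c("0-19", "20-64", "65+"),
                              sex = c("Female", "Male")))
## duplicated label within the "age" dimension
bad <- array(1:6, dim = c(3, 2),
             dimnames = list(age = c("0-19", "0-19", "65+"),
                             sex = c("Female", "Male")))
## no dimnames at all
unnamed <- array(1:6, dim = c(3, 2))

hasCompleteDimnames(good)
hasCompleteDimnames(bad)
hasCompleteDimnames(unnamed)
```

Base R accepts all three arrays; under the rule described above, only the first would be a plausible input for a demographic array.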
Objects of class Counts and Values are generated using functions
Counts and Values. Because DemographicArray
is a virtual class, no objects may be created from it.
When demographic arrays are used in arithmetic, or are supplied to a function, one or more of the objects will attempt to reshape themselves so that the objects are compatible. The reshaping involves the following operations:
Dimensions are rearranged so that they follow the same order.
If an object of class "Values" lacks a dimension that others have, the missing dimension is added to that object.
If an object of class
"Counts" has a dimension that others lack, the extra
dimension is collapsed away.
Categories within each dimension are rearranged so that they follow the same order.
If an object of class
"Values" uses coarser intervals than other objects, the coarser
intervals are split. Cells within the new intervals have the same
values as cells within the old combined interval.
If an object of class
"Counts" uses finer intervals than other objects, the finer
intervals are collapsed.
If one object contains categories that another object does not, the extra categories are typically dropped.
If these operations are not sufficient to align objects, then an error is raised. In particular, an error will be raised if the only way to align objects is to remove cells.
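The first and fourth operations (reordering dimensions and reordering categories) can be sketched by hand on ordinary arrays. This is only an illustration of the idea in base R, under the assumption that both arrays have the same dimensions and categories; it does not show how dembase implements alignment internally.

```r
x <- array(1:6, dim = c(3, 2),
           dimnames = list(age = c("0-19", "20-64", "65+"),
                           sex = c("Female", "Male")))
y <- array(c(10, 20, 30, 40, 50, 60), dim = c(2, 3),
           dimnames = list(sex = c("Male", "Female"),
                           age = c("0-19", "20-64", "65+")))

## rearrange the dimensions of 'y' so they follow the order used by 'x'
perm <- match(names(dimnames(x)), names(dimnames(y)))
yAligned <- aperm(y, perm = perm)

## rearrange the categories within each dimension to match 'x'
yAligned <- yAligned[dimnames(x)$age, dimnames(x)$sex]

## the two arrays now line up cell by cell
identical(dimnames(yAligned), dimnames(x))
x * yAligned
```

With demographic arrays these steps are carried out automatically, which is why the manual aperm and indexing calls are not needed.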
The rules for adding dimensions to objects of class "Values",
and for splitting intervals within objects of class "Values",
assume that, within each cell of the original classification, every
person or event is identical. These sorts of homogeneity assumptions
are standard in applied demography. The assumptions are more
plausible when more categories and dimensions are used. Homogeneity
assumptions can be avoided by adding dimensions or splitting intervals
'by hand' with functions such as addDimension.
When there is a mixture of "Counts" and "Values" objects,
there is often a choice between collapsing the "Counts" objects and
splitting or adding to the "Values" objects. The default is to
split and add to the "Values" objects, as this preserves all
the original detail while giving the same subtotals.
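The arithmetic behind this choice can be sketched with ordinary named vectors. The numbers and the mapping between age groups below are made up for illustration, and the code uses base R only rather than dembase's own machinery.

```r
## rates (playing the role of "Values") on coarse age groups
rate <- c("0-64" = 0.02, "65+" = 0.10)
## population (playing the role of "Counts") on finer age groups
popn <- c("0-19" = 300, "20-64" = 500, "65+" = 200)
## mapping from each fine age group to the coarse group that contains it
coarse <- c("0-19" = "0-64", "20-64" = "0-64", "65+" = "65+")

## option 1: split the coarse intervals of the rates; under the
## homogeneity assumption each fine cell inherits its parent's value
rateSplit <- rate[coarse]
names(rateSplit) <- names(coarse)

## option 2: collapse the fine intervals of the counts by summing
popnCollapsed <- tapply(popn, coarse[names(popn)], sum)

## both options give the same total, but only option 1 keeps
## the detail by fine age group
totalSplit <- sum(rateSplit * popn)
totalCollapsed <- sum(rate * as.vector(popnCollapsed[names(rate)]))
all.equal(totalSplit, totalCollapsed)
rateSplit * popn
```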
A function that was designed to work with ordinary arrays will
generally give an equivalent result when used with a demographic array.
For instance, if a is an array, then sum(a)
equals sum(Counts(a)).
Some methods for demographic arrays include options not
available for ordinary arrays. See, for instance,
as.data.frame and
names.
In some cases, copying the behaviour of ordinary arrays would require
breaking the rules governing dimension names, dimtypes, and
dimscales discussed in dimtypes. See, for instance,
drop.
Function names returns NULL when used
with an ordinary array, but returns the names of the dimensions when
used with a demographic array.
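The ordinary-array side of this contrast can be checked in base R, where the dimension names have to be reached through dimnames instead:

```r
a <- array(1:6, dim = c(3, 2),
           dimnames = list(age = c("0-19", "20-64", "65+"),
                           sex = c("Female", "Male")))
## 'names' finds nothing on an ordinary array
is.null(names(a))
## the dimension names are stored as the names of the dimnames list
names(dimnames(a))
```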
Counts, Values, dimtypes,
dimscales. The main new functions for manipulating
demographic arrays are listed in dembase.
a <- array(stats::rpois(n = 6, lambda = 10),
           dim = c(3, 2),
           dimnames = list(age = c("0-19", "20-64", "65+"),
                           sex = c("Female", "Male")))
x <- Counts(a)
x
plot(x)
x^2
mean(x)
names(x)
collapseDimension(x, dimension = "age")
b <- array(rnorm(n = 6),
           dim = c(2, 3),
           dimnames = list(sex = c("Male", "Female"),
                           age = c("0-19", "20-64", "65+")))
y <- Values(b)
y
## 'y' is automatically reshaped to align to 'x'
x * y
## weights are required with objects of class "Values"
collapseDimension(y, dimension = "age", weights = x)