Extracting EDH Variables¶
EDH variables¶
Another wrapper function, edhw(), extracts variables from the EDH dataset.
Use data("rp") to load the Roman provinces accepted by the province argument.
edhw()¶
Function usage¶
# accepted parameter arguments
R> edhw(vars, x = NULL, as = c("list", "df"), type = c("long", "wide", "narrow"),
split, select, addID, limit, id, na.rm, clean, province, gender, ...)
Parameters¶
Formal arguments of edhw() are:
- vars: Chosen variables from the EDH dataset (vector)
- x: An optional list object name with fragments of the EDH dataset
- as: Format to return the output. Currently either as a "list" or a data frame "df" object.
- type: Format of the data frame output. Currently either a "long" or a "wide" table (option "narrow" not yet implemented).
- split: Divide the data into groups by id? (logical and optional)
- select: "people" variables to select (vector and optional, data frame type "long" only)
- addID: Add identification to the output? (optional and logical)
- limit: Limit the returned output. Ignored if id is specified (optional, integer or vector)
- id: Select only the hd_nr id(s) (optional, integer or character)
- na.rm: Remove entries with <NA>? (logical and optional)
- clean: Replace entries with <NA>? (logical and optional)
- province: Roman province (character, optional) as in "rp" dataset.
- gender: People gender in EDH (character, optional)
Todo
Implement the "narrow" type option to edhw().
Attributes in EDH dataset¶
The aim of the edhw() function is to extract attributes or variables of inscriptions. These inscriptions are
typically the output values of the get.edh() or get.edhw() functions.
The records in the EDH dataset have at least one of the following items:
|
|
|
Another output variable is "people", a list of the persons named in the inscriptions, with at least the following
items:
|
|
Relative dating in EDH¶
We now apply the edhw() function to check the relative dating of Roman inscriptions.
For a simple relative dating analysis, we choose the chronological data variables in vars:
# make a list for relative variables in 'EDH' (default)
R> edhw(vars=c("not_after", "not_before"))
Since argument x is not specified, the function takes the "EDH" dataset from the sdam package, if available,
and issues a warning message.
In this case, the upper and lower boundaries of the timespan of existence are the variables "not_after"
and "not_before", respectively.
(See Aoristic analysis for a treatment of timespan of existence.)
Hint
The above use of function edhw() is wrapping the base lapply function as
# apply a function over each element of the list for variables
R> lapply(EDH, `[`, c("not_after", "not_before") )
where a pair of backquotes (aka "backticks") is a way to refer in R to names or combinations of symbols that are
otherwise reserved or illegal, or non-syntactic names. Hence, e.g. lapply(foo, `[`, c(...) )
is the same as lapply(foo, function (x) x[c(...)]).
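The equivalence stated in the hint can be checked on a small stand-in list. This is a minimal sketch in base R: the object toy and its note field are hypothetical, and only the two records' id and dating values are taken from the output shown in this section.

```r
# Toy stand-in for the EDH list (hypothetical object, not the real dataset)
toy <- list(
  list(id = "HD000001", not_after = "0130", not_before = "0071", note = "ignored"),
  list(id = "HD000002", not_after = "0200", not_before = "0051", note = "ignored")
)

vars <- c("not_after", "not_before")

# `[` passed as a function: subset every record by the chosen names
by_operator <- lapply(toy, `[`, vars)

# the same subsetting written with an explicit anonymous function
by_function <- lapply(toy, function(x) x[vars])

# both forms return the same list of lists
identical(by_operator, by_function)
```

Passing `[` directly is only shorthand; the anonymous function form is easier to extend when extra processing per record is needed.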
The result for these chronological data items is a list object in which every entry carries an id,
which is the EDH hd_nr.
R> str(edhw(vars=c("not_after", "not_before")))
#List of 83821
# $ :List of 3
#  ..$ id        : chr "HD000001"
#  ..$ not_after : chr "0130"
#  ..$ not_before: chr "0071"
# $ :List of 3
#  ..$ id        : chr "HD000002"
#  ..$ not_after : chr "0200"
#  ..$ not_before: chr "0051"
# ...
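Because the dating variables are stored as character strings, they need converting before any arithmetic. The following base R sketch rebuilds the two records shown above by hand (the object names records and dates are hypothetical) and computes the length of each timespan. It does not call edhw() and is not meant to reproduce its as = "df" output.

```r
# The two records from the str() output above, rebuilt by hand
records <- list(
  list(id = "HD000001", not_after = "0130", not_before = "0071"),
  list(id = "HD000002", not_after = "0200", not_before = "0051")
)

# Pull each field out of every record and convert the years to integers
dates <- data.frame(
  id         = vapply(records, `[[`, character(1), "id"),
  not_before = as.integer(vapply(records, `[[`, character(1), "not_before")),
  not_after  = as.integer(vapply(records, `[[`, character(1), "not_after"))
)

# Length of the timespan of existence, in years
dates$span <- dates$not_after - dates$not_before
dates
```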
Complete cases¶
By default, function edhw() does not remove missing data when present in all variables, but it is possible to
remove missing information by activating the na.rm argument and to work with complete cases.
# remove missing data
R> str(edhw(vars=c("not_after", "not_before"), na.rm=TRUE))
#List of 60224
# $ :List of 3
# ...
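To illustrate where <NA> entries come from, the base R sketch below subsets a toy list in which some records lack a dating variable. The records with ids HD999998 and HD999999 are invented for illustration. Keeping only records that have every chosen variable is an assumption about what "complete cases" means here, not a description of how na.rm is implemented in edhw().

```r
# Toy records; the last two are hypothetical and lack some dating variables
toy <- list(
  list(id = "HD000001", not_after = "0130", not_before = "0071"),
  list(id = "HD999998", not_before = "0101"),  # hypothetical: no not_after
  list(id = "HD999999")                        # hypothetical: no dating at all
)

vars <- c("not_after", "not_before")

# Subsetting by a name that is absent yields a NULL element named NA
picked <- lapply(toy, `[`, c("id", vars))
names(picked[[2]])

# Assumed rule: keep a record only if all chosen variables are present
has_all  <- vapply(toy, function(x) all(vars %in% names(x)), logical(1))
complete <- picked[has_all]
length(complete)
```

Under this rule, records with partial dating are dropped along with those that have none.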
However, tackling the temporal uncertainty problem is an important type of analysis.
See also
Missing data within temporal uncertainty.