Extracting `EDH` Variables

EDH variables

Another wrapper function, edhw(), extracts variables from the EDH dataset. Use data("rp") to load the Roman provinces accepted by the province argument.

edhw()

Function usage

# accepted parameter arguments
R> edhw(vars, x = NULL, as = c("list", "df"), type = c("long", "wide", "narrow"),
        split, select, addID, limit, id, na.rm, clean, province, gender, ...)

Parameters

Formal arguments of edhw() are:

vars:

Chosen variables from the EDH dataset (vector)
x:

An optional list object name with fragments of the EDH dataset
as:

Format to return the output. Currently either as a "list" or a data frame "df" object.
type:

Format of the data frame output. Currently either a "long" or a "wide" table (option "narrow" not yet implemented).
split:

Divide the data into groups by id? (logical and optional)
select:

"people" variables to select (vector and optional, data frame type "long" only)
addID:

Add identification to the output? (optional and logical)
limit:

Limit the returned output. Ignored if id is specified (optional, integer or vector)
id:

Select only the hd_nr id(s) (optional, integer or character)
na.rm:

Remove entries with <NA>? (logical and optional)
clean:

Replace entries with <NA>? (logical and optional)
province:

Roman province (character, optional) as in "rp" dataset.
gender:

Gender of people in EDH (character, optional)

Todo

Implement the "narrow" type option to edhw().

Attributes in `EDH` dataset

The aim of the edhw() function is to extract attributes or variables of inscriptions. These inscriptions are output values typically produced by the get.edh() or get.edhw() functions.

The records in the EDH dataset have at least one of the following items:

"commentary" "fotos" "country" "depth"

"diplomatic_text" "edh_geography_uri" "findspot" "findspot_ancient" "findspot_modern"

"geography" "height" "id" "language" "last_update" "letter_size" "literature"

"material" "military" "modern_region" "not_after" "not_before" "present_location"

"religion" "province_label" "responsible_individual" "social_economic_legal_history"

"transcription" "trismegistos_uri" "type_of_inscription" "type_of_monument" "uri"

"width" "work_status" "year_of_find"

Another output variable is "people", which is a list of the persons named in the inscriptions, with at least the following items:

"person_id" "nomen" "cognomen" "praenomen" "name" "gender" "status" "tribus"

"origo" "occupation" "age: years" "age: months" "age: days"

Relative dating in EDH

We now apply the edhw() function to check the relative dating of Roman inscriptions. For a simple relative dating analysis, we choose the chronological variables in vars:

# make a list with the relative dating variables in 'EDH' (list output is the default)
R> edhw(vars=c("not_after", "not_before"))

Since argument x is not specified, the function takes the "EDH" dataset from the sdam package, if available, and issues a warning message.

In this case, the upper and lower boundaries of the timespan of existence are the variables "not_after" and "not_before", respectively.

(see Aoristic analysis for a treatment of timespan of existence.)
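As a minimal base R sketch of how these two boundaries define a timespan, the block below uses the two records that appear in the str() output of edhw() in this section. The data frame and its column names (lower, upper, duration, midpoint) are illustrative assumptions and not part of the sdam package.

```r
# Two records taken from the str() output of edhw() shown in this section.
recs <- list(
  list(id = "HD000001", not_after = "0130", not_before = "0071"),
  list(id = "HD000002", not_after = "0200", not_before = "0051")
)

# Years are stored as zero-padded character strings; convert them to integers.
lower <- as.integer(vapply(recs, `[[`, character(1), "not_before"))
upper <- as.integer(vapply(recs, `[[`, character(1), "not_after"))

# Hypothetical summary table (column names are assumptions for illustration).
spans <- data.frame(
  id       = vapply(recs, `[[`, character(1), "id"),
  lower    = lower,
  upper    = upper,
  duration = upper - lower,        # length of the timespan in years
  midpoint = (lower + upper) / 2   # centre of the timespan
)
spans
```

For HD000001 the timespan runs from 71 to 130, which gives a duration of 59 years.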

Hint

The above use of function edhw() wraps the base lapply() function as
# apply `[` to each element of the list to pick the chosen variables
R> lapply(EDH, `[`, c("not_after", "not_before") )
where a pair of backquotes (aka “backticks”) is the way to refer in R to names or combinations of symbols that are otherwise reserved or illegal, that is, non-syntactic names. Hence, e.g., lapply(foo, `[`, c(...)) is the same as lapply(foo, function(x) x[c(...)]).
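The equivalence described in this hint can be checked with base R alone. The sketch below uses a toy stand-in for the EDH list. The object names and the extra "material" field with placeholder values are assumptions for illustration.

```r
# Toy stand-in for the EDH list; "material" holds placeholder values.
toy <- list(
  list(id = "HD000001", not_after = "0130", not_before = "0071",
       material = "placeholder"),
  list(id = "HD000002", not_after = "0200", not_before = "0051",
       material = "placeholder")
)
vars <- c("not_after", "not_before")

# `[` passed by name: lapply() forwards `vars` as the index argument.
by_backtick <- lapply(toy, `[`, vars)

# The same subsetting written as an anonymous function.
by_function <- lapply(toy, function(x) x[vars])

# Both forms return the same list of two-item records.
identical(by_backtick, by_function)
```

Note that this plain lapply() call keeps only the variables named in vars.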

Such chronological data are returned as a list object in which every entry carries an id, which is the EDH hd_nr.

R> str(edhw(vars=c("not_after", "not_before")))
#List of 83821
# $ :List of 3
#  ..$ id        : chr "HD000001"
#  ..$ not_after : chr "0130"
#  ..$ not_before: chr "0071"
# $ :List of 3
#  ..$ id        : chr "HD000002"
#  ..$ not_after : chr "0200"
#  ..$ not_before: chr "0051"
# ...

Complete cases

By default, function edhw() does not remove missing data when it is present in all variables, but it is possible to remove missing information by activating the na.rm argument and working with complete cases.

# remove missing data
R> str(edhw(vars=c("not_after", "not_before"), na.rm=TRUE))
#List of 60224
# $ :List of 3
# ...

However, removing such records sidesteps the temporal uncertainty problem, and tackling that problem is an important type of analysis in its own right.
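To make the idea of complete cases concrete, here is a base R sketch on toy records that separates two possible criteria: records missing any of the chosen variables, and records missing all of them. The toy values, the use of NA to mark missing dates, and the object names are assumptions. The sketch does not claim to reproduce what na.rm does inside edhw().

```r
# Toy records; NA marks a missing date (hypothetical values).
recs <- list(
  list(id = "A", not_after = "0130", not_before = "0071"),
  list(id = "B", not_after = NA,     not_before = "0051"),
  list(id = "C", not_after = NA,     not_before = NA)
)
vars <- c("not_after", "not_before")

# TRUE when any, or when all, of the chosen variables are missing in a record.
any_missing <- vapply(recs, function(r) anyNA(unlist(r[vars])), logical(1))
all_missing <- vapply(recs, function(r) all(is.na(unlist(r[vars]))), logical(1))

complete  <- recs[!any_missing]  # strict complete cases: record "A" only
non_empty <- recs[!all_missing]  # at least one date present: "A" and "B"

vapply(complete, `[[`, character(1), "id")
vapply(non_empty, `[[`, character(1), "id")
```

Record "B" shows why the choice matters for temporal uncertainty: it has a lower boundary but no upper boundary, so the strict criterion drops it while the looser one keeps it.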

See also

Missing data within temporal uncertainty.
