.. _DDHH:

******************
Digital Humanities
******************

The use of digital technologies to pursue research questions in the humanities.

.. |

Data for humanities
===================

1) Document markup languages

   - **ConTeXt** is a TeX macro package with a cleaner interface for controlling the typography of a document, while retaining LaTeX's structure-oriented approach

     - with separation of content and presentation, it can format XML text, ...

   - **EAD** (*Encoded Archival Description*)
   - ...

2) Citation

   - *Arts and Humanities Citation Index* (AHCI)
   - Machine-readable bibliographic record

     - MARC, RIS, BibTeX

3) Geospatial/geographical data

   - **GeoJSON** is a geospatial data interchange format based on *JavaScript Object Notation* (JSON).
   - **Leaflet** is an open-source JavaScript library for mobile-friendly, cross-browser, interactive maps.
   - A Web Map Service (**WMS**) is a standard protocol for serving georeferenced map images over the Internet; the images are generated by a map server using data from a GIS database.

     - See also Web Feature Service (**WFS**)

|

.. index:: data-formats

Data formats
============

Data can be stored in different formats.

|

.. _json-str:

JSON structure
--------------

JSON stands for JavaScript Object Notation, and it is based on a subset of the JavaScript Programming Language Standard ECMA-262. JSON is built on two structures: a collection of name/value pairs, and an ordered list of values. Schematically, a JSON structure looks like:

.. code-block::

   Object {
     Identifier: Value,
     Identifier: Array [
       Object {
         Identifier: Value
       }
     ]
   }

where an ``Identifier`` is a string delimited by double quotes, and a ``Value`` can be a ``string``, a ``number``, ``true``, ``false``, ``null``, an ``Array``, or another JSON ``Object``, as in the example above.

JSON in R
+++++++++

Some R packages on CRAN for reading JSON files are

* ``rjson`` v0.1.0 released on Jul 30 2007
* ``RJSONIO`` v0.3-1 released on Oct 4 2010
* ``jsonlite`` v0.9.0 released on Dec 3 2013

|

.. _xml-str:

eXtensible Markup Language
--------------------------

.. todo:: eXtensible markup language (XML) structure

|

.. _rst-str:

Lightweight markup languages
----------------------------

Lightweight markup languages use a simple, human-readable plain-text syntax and are widely used for producing documentation on the Web.

|

Markdown
++++++++

**Markdown** (MD), with suffixes ``.md``, ``.Rmd``, etc., is currently the markup language of GitHub, and is hence very popular among developers using this platform. Its popularity for writing on the Web has, however, come at the cost of consistency and robustness, and today there are several flavours of MD:

* Basics and syntax of the *"Gruber Markdown"* are in the `creator's webpage `_
* *CommonMark* is a rigorously specified version of the Gruber Markdown, developed by a group that includes representatives from GitHub, Stack Exchange, and Reddit, and is therefore today a *de facto* standard on the Web.
* *GitHub Flavored Markdown* or *GFM* is a superset of CommonMark with GitHub-specific syntax extensions.
* Other flavours of Markdown include *MultiMarkdown*, *Markdown Extra*, *CriticMarkup*, *Ghost Markdown*, and others...

.. .. todo:: TBD

|

reStructuredText
++++++++++++++++

**reStructuredText** (RST) is plain text, written in files with the suffix ``.rst`` or ``.txt``, that uses simple and intuitive constructs to structure complex technical documentation. Here "complex" means features such as indexing, glossaries, etc.

One significant innovation of Markdown was the use of headers and interpreted text. RST goes a step further than MD with *directives* and *specialized roles*. For example, these features allow reStructuredText to render text and math formulae directly into LaTeX format.

The directive syntax in RST is

.. code-block::

   .. directive-type:: directive block

and an illustration of standard inline markup and of a specialized role is

.. code-block::

   *emphasis*           as standard inline markup
   :title:`emphasis`    with an explicit role

where most of the standard inline markup is common to MD and RST.
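
The directive and role syntax described above can be illustrated with a short Python sketch that uses only the standard library. This is a minimal illustration under stated assumptions, not how a real RST parser works: the two regular expressions are deliberate simplifications of the reStructuredText grammar, and the names ``DIRECTIVE``, ``ROLE``, ``find_directives``, ``find_roles``, as well as the sample text, are hypothetical.

.. code-block:: python

   import re

   # Assumption: a directive line has the form ".. name:: optional argument".
   # This ignores directive options and indented content blocks.
   DIRECTIVE = re.compile(
       r"^\.\. (?P<name>[A-Za-z0-9][A-Za-z0-9_+.-]*)::(?: (?P<argument>.*))?$"
   )

   # Assumption: an explicit role has the form ":name:`text`".
   ROLE = re.compile(r":(?P<name>[A-Za-z0-9][A-Za-z0-9_+.-]*):`(?P<text>[^`]+)`")


   def find_directives(source):
       """Return a (name, argument) pair for each directive line found."""
       found = []
       for line in source.splitlines():
           match = DIRECTIVE.match(line.strip())
           if match:
               found.append((match.group("name"), match.group("argument") or ""))
       return found


   def find_roles(source):
       """Return a (name, text) pair for each explicit role found."""
       return [(m.group("name"), m.group("text")) for m in ROLE.finditer(source)]


   # Hypothetical sample text mixing a directive, inline markup, and a role.
   SAMPLE = """\
   .. note:: Directives start with two dots.

   Plain *emphasis* needs no role, while :title:`Don Quixote` names one.

   .. code-block:: python
   """

   directives = find_directives(SAMPLE)
   roles = find_roles(SAMPLE)
   print(directives)
   print(roles)

Note that ``*emphasis*`` is not reported by ``find_roles``, since it is inline markup without an explicit role name, whereas the text marked with ``:title:`` is.
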

In order to produce documentation, either in HTML or in LaTeX, reStructuredText needs a *builder*, which is a program that converts the RST source into the desired format. A popular builder is the ``Python`` package ``docutils``, which provides a separate script for each output format:

.. code-block::

   prompt> ./rst2html.py text.rst > text.html
   prompt> ./rst2latex.py text.rst > text.tex

An alternative is ``Sphinx``, which can also construct API documentation. It works with two folders: the RST sources are in a *source* folder, and the results of the transformation go into a *build* folder.

.. code-block::

   prompt> ./sphinx-build -b html [options] source build
   prompt> ./sphinx-build -b latex [options] source build

|

TeX and LaTeX
-------------

First released in 1978, ``TeX`` is a typesetting language that allows typesetting complex mathematical formulae. ``TeX`` is also the engine or program that does the typesetting. ``LaTeX`` is a generalised set of macros built on top of ``TeX`` that lets authors concentrate on the content and structure of the document.

.. todo:: TODO

|

Another data format
-------------------

.. todo:: TODO Another data format

|

.. .. seealso::

.. .. :ref:`"Digital tools" in Tools for Humanities and Social Sciences `

.. * See :ref:`"Digital tools" in Tools for Humanities and Social Sciences `

.. meta::
   :description: Digital Technologies for Humanities
   :keywords: digital-humanities, documentation, data-formats