DEiC’s sciencedata.dk

Accessing DEiC’s (Danish e-Infrastructure Cooperation) sciencedata.dk needs another tools, and this is basically achieved by performing a HTTP request as a client to one of DEiC’s server.

************
|  CLIENT  | -->-->  request  ->-->-->-->-->.
|  ======  |                                |
************                                .
     `                                      |
     ^                                 ************
     `<--<--<--<--<  response  <-- <-- |  SERVER  |
                                       |  ======  |
                                       ************

The server’s response depends on the method used in the HTTP request.


Accessing DEiC’s sciencedata.dk using R

Function request() from the R package “sdam” is aimed to interact with DEiC’s sciencedata.dk

Note that this function requires the [R] package "httr".

Functions Usage

request()
# arguments supported (currently)
R> request(file
          ,URL="https://sciencedata.dk"
          ,method=c("GET","POST","PUT","DELETE")
          ,anonymous=FALSE
          ,path="/files"
          ,cred=NULL
          ,subdomain=NULL
          ,...)

Parameters

  • file (object under ‘method’)

  • URL (protocol and domain of the url)

  • method (the http “verb” for the object)

    "GET" (list)

    "POST" (place)

    "PUT" (update)

    "DELETE" (cancel)

  • anonymous (logical, unauthenticated user?)

  • path (optional path or subdirectory to add to the url)


Additional parameters

  • cred (authentication credentials, vector with username and password)

  • subdomain (optional, add subdomain to the url)

  • (extra parameters if required)


Arguments

Arguments of request() are retrieved with the formals() function.

R> formals(request)
#$file
#
#
#$URL
#[1] "https://sciencedata.dk"
#
#$method
#c("GET", "POST", "PUT", "DELETE")
#
#$anonymous
#[1] FALSE
#
#$cred
#NULL
#
#$path
#[1] "/files"
#
#$subdomain
#NULL

Note

Aliases for request() are sddk() and SDDK().


Output

The output is the server’s response that depends on the method to be used in the request.

A Response message is returned when the method is PUT with the url and items Date, Status, Content-Type.


Details

There are two types of folders in DEiC’s sciencedata.dk that are personal and shared folders and both requires authentication with credentials.

The path to the shared folders where the files are located must be specified with the path argument. However, for personal folders is the file argument that includes the path information.

That is, an [R] code will be like

# personal folders
R> request("path/file")

# shared folders
R> request("file", path="/path")

Many times, DEiC’s sciencedata.dk places the data on a subdomain, and for some request methods like PUT it is needed to specify the subdomain as well.


Authentification

In case that accessing the server requires basic authentification, then package "tcltk" may be needed as well to input the credentials with a widget prompt. request() has the cred argument for performing a basic authentification.

In DEiC’s sciencedata.dk, both personal and shared folders need some sort of authentication. With the basic authentication, the credentials are given with the username and password used under your personal ‘sciencedata.dk’ settings.

Hint

It is possible to prevent the widget by recording this information in a vector object. If you want to avoid a dialog box then save your credentials.

# save authentication credentials
R> mycred <- c("YOUR-AUID@au.dk", "YOURPASSWORD")

However, in many cases such as with public folders in sciencedata.dk authentification is not needed and you can disable it by setting anonymous to TRUE.


Responses

Server responses carry a code called HTTP status code where 2xx means success, and 4xx means client error. There is also a status code like 5xx for server error, and 3xx for redirection (and where codes 1xx are just informative).

Todo

Typical staus codes in the response are 404, 201, 307

When using the request() function, the HTTP status code is given under Status in the response message below the time stamp.


Examples

Some examples of HTTP requests are given next where reponse messages in some cases are given afterwards, and recall that request() requires the httr package.

# load required package
R> require("httr")  # https://cran.r-project.org/package=httr

Method GET

This method is for accessing the files with the data.

# for personal data (in case you have this file)
R> request("df.json", cred=mycred)

#[1] {"a":{"0":"a1","1":"a2"},"b":{"0":"b1","1":"b2"},"c":{"0":"c1","1":"c2"}}
# for shared folders (example Vojtech test folder), where both options work
R> request("df.json", path="/sharingin/648597@au.dk/TEST_shared_folder/", method="GET", cred=mycred)

#[1] {"a":{"0":"a1","1":"a2"},"b":{"0":"b1","1":"b2"},"c":{"0":"c1","1":"c2"}}

Note

If there is any error, then the HTTP status code with the GET method is 200 or OK but it is not returned.


Method PUT

The URL typically includes also a subdomain that for DEiC’s sciencedata.dk is named silo followed by a number. For instance, my personal documents are located in silo1.sciencedata.dk, and other users that will follow are probably located at silo2, etc.

PUT in own folder

For method PUT, the subdomain is mandatory; otherwise the request is redirected.

# for personal data (in my case) I need to specify the subdomain; otherwise it gets redirected!
R> request(system.file("CITATION"), method="PUT", cred=mycred)

# Response [https://sciencedata.dk/files/CITATION]
#  Date: 2020-01-17 13:31
#  Status: 307
#  Content-Type: text/html; charset=UTF-8
#<EMPTY BODY>

The HTTP status code 307 means temporary redirect.

# my data is in subdomain "silo1"
R> request(system.file("CITATION"), method="PUT", cred=mycred, subdomain="silo1")

# Response [https://silo1.sciencedata.dk/files/CITATION]
#  Date: 2020-01-17 13:31
#  Status: 201
#  Content-Type: text/html; charset=UTF-8
#<EMPTY BODY>

The HTTP status code 201 means that the file was created in the server side.

PUT in a sharing folder

# (example Vojtech test folder)

R> request(system.file("CITATION"), path="sharingin/648597@au.dk/TEST_shared_folder",
+    method="PUT", cred=mycred)

# Response [https://sciencedata.dk/sharingin/648597@au.dk/TEST_shared_folder/CITATION]
#  Date: 2020-01-17 13:34
#  Status: 307
#  Content-Type: text/html; charset=UTF-8
#<EMPTY BODY>



R> request(system.file("CITATION"), path="sharingout/648597@au.dk/TEST_shared_folder",
+    method="PUT", cred=mycred)

#Response [https://sciencedata.dk/sharingout/648597%40au.dk/TEST_shared_folder//CITATION]
#  Date: 2020-02-10 09:32
#  Status: 201
#  Content-Type: text/html; charset=UTF-8
#<EMPTY BODY>

Hence, the PUT method for a shared folder needs 'sharingout' in the path; otherwise it gets redirected.

Note

In some cases, the metacharacter @ in the path is “escaped” as %40.


Method DELETE

In the case of accesing with a request using methods GET or PUT, the path in the url is followed by sharingin/USERID/FOLDERNAME, and for DELETE the response is given with sharingout in the path.

# for personal folder
R> request("df.json", method="DELETE", cred=mycred)

# In my case, this is in
#[1] "https://silo1.sciencedata.dk/files/df.json"
# for shared folders (example Vojtech test folder)
R> request("CITATION", path="/sharingin/648597@au.dk/TEST_shared_folder/", method="DELETE", cred=mycred)

#[[1]]
#[1] "https://sciencedata.dk/sharingout/648597%40au.dk/TEST_shared_folder/CITATION"

Method POST

Finally, there is also the possibility to place files with the POST method along with extra information.

R> request(FILE, URL, method="POST")

Typically with a path argument and subdomain if required.

Note

Method POST is not yet implemented in sciencedata.dk