DEiC’s sciencedata.dk¶
Accessing DEiC’s (Danish e-Infrastructure Cooperation) sciencedata.dk requires other tools, and it is basically achieved by performing an HTTP request as a client to one of DEiC’s servers.
************
|  CLIENT  |  -->--> request -->-->-->-->-->.
|  ======  |                                |
************                                v
     ^                                ************
     `<--<--<--<-- response <--<--<-- |  SERVER  |
                                      |  ======  |
                                      ************
The server’s response depends on the method used in the HTTP request.
Accessing DEiC’s sciencedata.dk using R¶
The function request() from the R package “sdam” is designed to interact with DEiC’s sciencedata.dk.
Note that this function requires the [R] package "httr".
Functions Usage¶
request()¶
# arguments supported (currently)
R> request(file
,URL="https://sciencedata.dk"
,method=c("GET","POST","PUT","DELETE")
,anonymous=FALSE
,path="/files"
,cred=NULL
,subdomain=NULL
,...)
Parameters¶
file (object under ‘method’)
URL (protocol and domain of the url)
method (the HTTP “verb” for the object)
    "GET" (list)
    "POST" (place)
    "PUT" (update)
    "DELETE" (cancel)
anonymous (logical, unauthenticated user?)
path (optional path or subdirectory to add to the url)
Additional parameters¶
cred (authentication credentials, vector with username and password)
subdomain (optional, add subdomain to the url)
… (extra parameters if required)
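The method vector in the usage above suggests the common R idiom in which match.arg() selects one of the allowed values and defaults to the first. The sketch below illustrates that idiom only; describe_method() is a hypothetical helper, not part of "sdam", and whether request() uses match.arg() internally is an assumption.

```r
# Hypothetical helper (not part of "sdam"): validate an HTTP verb and
# return the short description given in the parameter list above.
describe_method <- function(method = c("GET", "POST", "PUT", "DELETE")) {
  # match.arg() picks the first value when 'method' is not supplied
  # and raises an error for values outside the allowed set
  method <- match.arg(method)
  switch(method,
         GET    = "list",
         POST   = "place",
         PUT    = "update",
         DELETE = "cancel")
}

describe_method()        # no argument: the default verb, GET
describe_method("PUT")
```

A verb that is not in the vector, such as "PATCH", makes match.arg() stop with an error rather than silently sending an unsupported request.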
Arguments¶
Arguments of request() are retrieved with the formals() function.
R> formals(request)
#$file
#
#
#$URL
#[1] "https://sciencedata.dk"
#
#$method
#c("GET", "POST", "PUT", "DELETE")
#
#$anonymous
#[1] FALSE
#
#$cred
#NULL
#
#$path
#[1] "/files"
#
#$subdomain
#NULL
Note
Aliases for request() are sddk() and SDDK().
Output¶
The output is the server’s response, which depends on the method used in the request.
When the method is PUT, a Response message is returned with the URL and the items Date, Status, and Content-Type.
Details¶
There are two types of folders in DEiC’s sciencedata.dk, personal and shared, and both require authentication with credentials.
The path to the shared folders where the files are located must be specified with the path argument.
For personal folders, however, it is the file argument that includes the path information.
That is, the [R] code will look like
# personal folders
R> request("path/file")
# shared folders
R> request("file", path="/path")
DEiC’s sciencedata.dk often places the data on a subdomain, and for some request methods like PUT the subdomain must be specified as well.
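To make the roles of URL, subdomain, path and file concrete, here is a minimal base R sketch that reproduces the URLs appearing in the responses in this document. build_url() is a hypothetical helper and not part of "sdam"; the composition rule (protocol, optional subdomain, domain, path, file) is an assumption inferred from those responses.

```r
# Hypothetical helper (not part of "sdam"): compose a request URL from
# the pieces that request() takes as arguments.
build_url <- function(file, URL = "https://sciencedata.dk",
                      path = "/files", subdomain = NULL) {
  # put the subdomain right after the protocol,
  # e.g. https://silo1.sciencedata.dk
  if (!is.null(subdomain)) {
    URL <- sub("^(https?://)", paste0("\\1", subdomain, "."), URL)
  }
  # strip leading/trailing slashes so "/files", "files" and "/files/"
  # all behave alike
  path <- gsub("^/+|/+$", "", path)
  file <- sub("^/+", "", file)
  parts <- c(URL, path, file)
  paste(parts[nzchar(parts)], collapse = "/")
}

# personal folder: the path information travels in 'file'
build_url("path/file")
# shared folder: the location is given with 'path'
build_url("file", path = "/path")
# data stored on a subdomain
build_url("CITATION", subdomain = "silo1")
```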
Authentication¶
If accessing the server requires basic authentication, the package "tcltk" may be needed as well to enter the credentials in a widget prompt.
request() has the cred argument for performing basic authentication.
In DEiC’s sciencedata.dk, both personal and shared folders need some sort of authentication. With the basic authentication, the credentials are given with the username and password used under your personal ‘sciencedata.dk’ settings.
Hint
To avoid the dialog box, save your credentials in a vector object and pass it to the cred argument.
# save authentication credentials
R> mycred <- c("YOUR-AUID@au.dk", "YOURPASSWORD")
However, in many cases, such as with public folders in sciencedata.dk, authentication is not needed and you can disable it by setting anonymous to TRUE.
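As an illustration of how a credentials vector and the anonymous flag could be mapped onto basic authentication, the sketch below uses httr::authenticate(), which builds a request configuration without contacting the server. make_auth() is a hypothetical helper, not part of "sdam", and how request() handles credentials internally is not shown in this document.

```r
library(httr)

# Hypothetical helper (not part of "sdam"): turn a c(username, password)
# vector into an httr basic-authentication configuration.
make_auth <- function(cred = NULL, anonymous = FALSE) {
  # public folders: no credentials are prepared
  if (anonymous) return(NULL)
  stopifnot(is.character(cred), length(cred) == 2)
  authenticate(cred[1], cred[2], type = "basic")
}

# placeholder credentials, as in the hint above
mycred <- c("YOUR-AUID@au.dk", "YOURPASSWORD")
auth <- make_auth(mycred)              # configuration object for httr verbs
none <- make_auth(anonymous = TRUE)    # NULL: nothing to send
```

Keeping real passwords in a script is risky; reading them from an environment variable with Sys.getenv() is one alternative.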
Responses¶
Server responses carry a code called the HTTP status code, where 2xx means success and 4xx means client error. There are also status codes 5xx for server error and 3xx for redirection (codes 1xx are just informative).
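The classes above can be expressed as a small base R lookup. status_class() is a hypothetical helper written for illustration and is not part of "sdam" or "httr".

```r
# Hypothetical helper: map HTTP status codes to the classes described
# above (1xx informative, 2xx success, 3xx redirection, 4xx client
# error, 5xx server error).
status_class <- function(code) {
  classes <- c("informational", "success", "redirection",
               "client error", "server error")
  # the leading digit of the code selects the class
  idx <- as.integer(code) %/% 100L
  valid <- idx >= 1L & idx <= 5L
  # clamp the index so that invalid codes do not break the lookup
  ifelse(valid, classes[pmin(pmax(idx, 1L), 5L)], "unknown")
}

status_class(c(201, 307, 404))
```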
Todo
Typical status codes in the response are 404, 201, 307 …
When using the request() function, the HTTP status code is given under Status in the response message, below the time stamp.
Examples¶
Some examples of HTTP requests are given next, in some cases followed by the response messages. Recall that request() requires the httr package.
# load required package
R> require("httr") # https://cran.r-project.org/package=httr
Method GET¶
This method is for accessing the files with the data.
# for personal data (in case you have this file)
R> request("df.json", cred=mycred)
#[1] {"a":{"0":"a1","1":"a2"},"b":{"0":"b1","1":"b2"},"c":{"0":"c1","1":"c2"}}
# for shared folders (example Vojtech test folder), where both options work
R> request("df.json", path="/sharingin/648597@au.dk/TEST_shared_folder/", method="GET", cred=mycred)
#[1] {"a":{"0":"a1","1":"a2"},"b":{"0":"b1","1":"b2"},"c":{"0":"c1","1":"c2"}}
Note
If there is no error, the HTTP status code with the GET method is 200 (OK), but it is not returned.
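The body returned in the GET examples is a JSON object keyed by column and then by row index. A minimal sketch of turning such a string into a data frame follows, assuming the body is available as a character string and using the "jsonlite" package; jsonlite is not mentioned in this document, but it is installed as a dependency of "httr".

```r
library(jsonlite)

# the JSON text shown in the GET examples above
txt <- '{"a":{"0":"a1","1":"a2"},"b":{"0":"b1","1":"b2"},"c":{"0":"c1","1":"c2"}}'

# each top-level key becomes a named list holding that column's values
parsed <- fromJSON(txt)

# flatten every column to a plain character vector and bind them together
df <- as.data.frame(lapply(parsed, unlist, use.names = FALSE),
                    stringsAsFactors = FALSE)
df
```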
Method PUT¶
The URL typically also includes a subdomain, which for DEiC’s sciencedata.dk is named silo followed by a number. For instance, my personal documents are located in silo1.sciencedata.dk, and later users are probably located at silo2, etc.
PUT in own folder¶
For method PUT, the subdomain is mandatory; otherwise the request is redirected.
# for personal data (in my case) I need to specify the subdomain; otherwise it gets redirected!
R> request(system.file("CITATION"), method="PUT", cred=mycred)
# Response [https://sciencedata.dk/files/CITATION]
# Date: 2020-01-17 13:31
# Status: 307
# Content-Type: text/html; charset=UTF-8
#<EMPTY BODY>
The HTTP status code 307 means temporary redirect.
# my data is in subdomain "silo1"
R> request(system.file("CITATION"), method="PUT", cred=mycred, subdomain="silo1")
# Response [https://silo1.sciencedata.dk/files/CITATION]
# Date: 2020-01-17 13:31
# Status: 201
# Content-Type: text/html; charset=UTF-8
#<EMPTY BODY>
The HTTP status code 201 means that the file was created on the server side.
PUT in a shared folder¶
# (example Vojtech test folder)
R> request(system.file("CITATION"), path="sharingin/648597@au.dk/TEST_shared_folder",
+ method="PUT", cred=mycred)
# Response [https://sciencedata.dk/sharingin/648597@au.dk/TEST_shared_folder/CITATION]
# Date: 2020-01-17 13:34
# Status: 307
# Content-Type: text/html; charset=UTF-8
#<EMPTY BODY>
R> request(system.file("CITATION"), path="sharingout/648597@au.dk/TEST_shared_folder",
+ method="PUT", cred=mycred)
#Response [https://sciencedata.dk/sharingout/648597%40au.dk/TEST_shared_folder//CITATION]
# Date: 2020-02-10 09:32
# Status: 201
# Content-Type: text/html; charset=UTF-8
#<EMPTY BODY>
Hence, the PUT method for a shared folder needs 'sharingout' in the path; otherwise it gets redirected.
Note
In some cases, the metacharacter @ in the path is “escaped” as %40.
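This escaping is standard URL percent-encoding and can be reproduced in base R with utils::URLencode() and utils::URLdecode(). The snippet only illustrates the encoding itself; whether request() or the server performs it is not stated here.

```r
# user part of the shared-folder path used in the examples
user_folder <- "648597@au.dk"

# reserved = TRUE also encodes reserved characters such as "@"
escaped <- utils::URLencode(user_folder, reserved = TRUE)
escaped

# decoding restores the original string
unescaped <- utils::URLdecode(escaped)
unescaped
```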
Method DELETE¶
When a request uses the GET or PUT method, the path in the URL is followed by sharingin/USERID/FOLDERNAME, whereas for DELETE the response is given with sharingout in the path.
# for personal folder
R> request("df.json", method="DELETE", cred=mycred)
# In my case, this is in
#[1] "https://silo1.sciencedata.dk/files/df.json"
# for shared folders (example Vojtech test folder)
R> request("CITATION", path="/sharingin/648597@au.dk/TEST_shared_folder/", method="DELETE", cred=mycred)
#[[1]]
#[1] "https://sciencedata.dk/sharingout/648597%40au.dk/TEST_shared_folder/CITATION"
Method POST¶
Finally, it is also possible to place files with the POST method, along with extra information.
R> request(FILE, URL, method="POST")
This is typically used with a path argument, and with a subdomain if required.
Note
Method POST is not yet implemented in sciencedata.dk.