Tiki and PluginR

a better R workflow on a webserver

posts: 68 France

We see that R uses a lot of resources, especially for this call:

s4r <- read.delim(quote="", file="./stats_4_R.txt", header=T, sep="|")

For each single graph, R recreates the s4r object. Is there a way to make it persistent so we can just use the s4r object directly in our wiki pages?
https://wikispiral.org

If R opens a new session for each graph (which is more or less OK), can't it define some objects for all the sessions, for instance via a bash script?

Is R running as an app or as a server? Do we need a special parameter to run R as a server? Does R free all its memory when it exits? Can't it keep some data in memory for future use?

Sorry for all these questions, but 12sec to load three simple pies is a bit harsh...

Have a nice day,

Joël

posts: 1811 Catalan Countries

Hi Joël:

Regarding this question:

s4r <- read.delim(quote="", file="./stats_4_R.txt", header=T, sep="|")


You can use the new version of PluginR (v0.80), released yesterday, which solves this issue.

Please, report any issues, and thanks for reporting this issue a few weeks ago off-list ;-)

From now on, everyone is asked to provide or request support through this forum. The only exception is security-related issues, if there is ever something to report, which should be handled like any other potential security question or warning, using http://security.tiki.org


posts: 68 France

Hello Xavi,
We managed to update the plugin but now for relative paths like ../../Theproperdirectory we get :

''Error in setwd("export_stats_files/") : cannot change working directory
Execution halted''

The absolute paths do work. But I really would prefer having relative paths to not show our server tree...


posts: 1811 Catalan Countries

Hi Joël:

Mmm, if you did set your relative path properly, shouldn't you be shown

''Error in setwd("../../export_stats_files/") : cannot change working directory
Execution halted''

instead?

Can you add a few getwd() calls along your script, to check what the working directory is at the beginning and at the end of your R call?

As far as I remember, you were only calling your relative path at the beginning of your R script, so setting the wd properly at the beginning should fix it.



posts: 68 France

So now due to the caching system, I have:
First load after RR plugin validation: 16sec
CTRL+F5 load: 4sec
(with a load message .temp/cache/R_Stats_%D0%9C%D0%B0%D0%B9%D0%BD%D0%B0/6d0b8396d6a7b97602e43bbfc3993451_1.png)
F5 load: 3.5sec
Which is much better than the 12sec I previously had on an F5 reload.


posts: 68 France

Thanks a lot Xavi!!!

...
This solves one of the symptoms, but it does not answer my initial question in this thread:
Is it somehow possible to give the R server an initial object that all running processes use, and can that be automated via a bash script?


posts: 1811 Catalan Countries

Hi Joël:

What do you mean by:

(with a load message .temp/cache/R_Stats_%D0%9C%D0%B0%D0%B9%D0%BD%D0%B0/6d0b8396d6a7b97602e43bbfc3993451_1.png)


And regarding your last (initial) question:

Is it somehow possible to give the R server an initial object that all running processes use, and can that be automated via a bash script?


After loading the csv data, you would benefit from saving a binary image of your object to disk, so that subsequent reads are much faster.

See this:

R code at the R console in RStudio for instance
# load your csv (read.csv or by any other means) into the object myobject 
save(myobject, file="myobjectondisk.Rda")
rm(myobject)
# Notice that you don't have myobject loaded onto your workspace anymore. 
load(file="myobjectondisk.Rda")


Remember, then, to use the PluginR parameter loadandsave="1" (which, by the way, has been enabled by default for a few versions, to make PluginR easier to use in the most frequent use cases).

Hope this helps...

posts: 1811 Catalan Countries

Ah, forgot the last part. If you want to automate the generation of the .Rda file from a bash script, you can create a simple R script that does just that part: load the csv, and save the Rda to disk.

And you can run that from a bash script with Rscript, etc.

See this nice example here:
http://stackoverflow.com/a/5248006
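
One way to keep such a job cheap is to regenerate the .Rda only when the text export has changed since the last run. The following is a minimal sketch, not something PluginR provides: needs_rebuild is a hypothetical helper name, and the file and script names in the usage comment are assumptions for illustration.

```shell
#!/bin/bash
# Sketch: rebuild the binary file only when the text export is newer than it.
# needs_rebuild is a hypothetical helper name, not part of R or PluginR.

needs_rebuild() {
    local src="$1" out="$2"
    # True when the output is missing, or older than the source file.
    [ ! -e "$out" ] || [ "$src" -nt "$out" ]
}

# Usage (file names and the R script name are assumptions):
#   if needs_rebuild stats_4_R.txt stats_4_R.Rda; then
#       Rscript make_rda.R
#   fi
```

The check relies only on file modification times, so it costs almost nothing when cron runs it often.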

posts: 68 France

Yes, this message appears while the pictures load when you press CTRL+F5 in Firefox:
R cache.PNG (10.38 Kb)

Your answer with the binary file is just what we need. Thanks! We'll do some tests and report here the results!


posts: 68 France

For instance, the first load of our global stats page:
https://wikispiral.org/tiki-index.php?page_ref_id=541 takes ... 46 sec (without loading the tests section - in !!- headings).
... And as we will put on this one, say, 25-45 RR plugins, it might take a long long time before anything shows...
Sure, then it only takes 5sec due to the caching system...

But our data will grow daily... So I guess this long waiting might happen more often to our users than we can imagine.

And the pity is that this is mainly because each time we have to recreate the very same s4r object from a (rapidly growing) database of 60000 entries... Which is really a bad design... So our question: how can we globally set an object in R that can be reused by all the sessions, or run R as a server, so that it can keep some variables or objects for global use?


posts: 68 France

Some stats :
After using load(file="s4r.Rda") (but also qplot instead of generic plotting), I got:
First load after plugin validation: 10sec
CTRL+F5 load: 7sec
F5 load: 3sec


posts: 1811 Catalan Countries
Nice Joël! (thanks for reporting it :-) )

posts: 37 France

I think I got some generic solution
The R script is:

create_Rda.R
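# Arguments: working directory, input text file, output .Rda file
args <- commandArgs(trailingOnly = TRUE)
dir <- args[1]
input.path <- args[2]
output.path <- args[3]

# Input and output paths are resolved relative to this directory
setwd(dir)
s4r <- read.delim(file=input.path, header=TRUE, sep="|", skip=1, quote="")
save(s4r, file=output.path)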


Then it is called by the shell script that crontab launches:

#!/bin/bash

STATS_FOLDER="export_stats_files/"
STATS_FILE="stats_4_R.txt" # this is the csv which contains the data in STATS_FOLDER
RDA_FILE="stats_4_R.Rda" # This will be created in STATS_FOLDER
RDA_FILE_TMP="stats_4_R_tmp.Rda" # this is a temporary file to ensure RDA_FILE remains available during processing

/usr/bin/Rscript create_Rda.R "${STATS_FOLDER}" "${STATS_FILE}" "${RDA_FILE_TMP}"

echo -n "Rda file "
diff -q "${STATS_FOLDER}/${RDA_FILE_TMP}" "${STATS_FOLDER}/${RDA_FILE}" > /dev/null  && (echo "OK"; /bin/rm "${STATS_FOLDER}/${RDA_FILE_TMP}" ) || (echo "Replace"; /bin/mv "${STATS_FOLDER}/${RDA_FILE_TMP}" "${STATS_FOLDER}/${RDA_FILE}")
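
The compare-and-swap step on the last line can also be written as a small function, which is easier to read and to test. This is only a sketch under a few assumptions: swap_if_changed is a hypothetical helper name, it adds a "Missing" case for when the R script produced no file, and it expects both files to live in the same directory so that mv is a plain rename and readers see either the old file or the new one.

```shell
#!/bin/bash
# Sketch: install a freshly generated file only if its content differs.
# swap_if_changed is a hypothetical helper name.

swap_if_changed() {
    local tmp="$1" final="$2"
    # Nothing to install if the generator did not produce a file.
    [ -f "$tmp" ] || { echo "Missing"; return 1; }
    if [ -f "$final" ] && cmp -s "$tmp" "$final"; then
        echo "OK"        # identical content: drop the temporary copy
        rm -f "$tmp"
    else
        echo "Replace"   # new or changed content: rename over the old file
        mv -f "$tmp" "$final"
    fi
}

# Usage with the variables of the script above:
#   swap_if_changed "${STATS_FOLDER}/${RDA_FILE_TMP}" "${STATS_FOLDER}/${RDA_FILE}"
```

Since cron does not start jobs in the script's directory, the relative paths in the script above only resolve if the crontab entry first changes to the right directory (or if absolute paths are used).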