Friday, 13 January 2017

Diagram templates in GCDkit

Happy new year to all! For this first post of 2017, let's dig into some technical details on graphs in GCDkit, for those of you who would like to develop your own templates. Interested? Read on....

One of the beauties of GCDkit is the possibility to access many predefined diagrams of common use. They reside in the "plots/classification" or "plots/geotectonic" menus. The difference between the two types of diagrams is subtle: diagrams from "classification" can be used to classify samples according to where they fall in the diagram, whereas those from the "geotectonic" group cannot be used in this way. In some cases the distinction is rather obvious (some of the geotectonic diagrams actually include multiple panels, so classification would be a bit more difficult). In other cases the difference is just a matter of how the function is written internally, and converting a geotectonic diagram to a classification diagram would be easy.

Diagram templates

Let's start by having a look at how diagrams are defined in GCDkit. You may have noticed that, when you load data, GCDkit prints a list of diagrams that can be used – i.e. the program reads diagrams from files on the disk, and checks for each of them whether the required data exists for this dataset (technically, if WR[] has the appropriate column headings). So adding one diagram template is actually a simple matter of writing a template file.

Template files reside (on a Windows 7 install) in C:\Program Files\R\R-3.2.1\library\GCDkit\Diagrams\ . If you navigate there (or to the …\library\GCDkit directory in your R installation, if you put R somewhere else for some reason), you will notice four "spider…" files (which we are not going to consider for now), a classification directory and a geotectonic directory. The "classification" directory has three sub-directories for the three languages now supported. Each of the three language directories contains the same files, with just the names translated. In fact the master directory is the English one, and the other two are automatically generated from a dictionary file (.dict.txt) during compilation – not something you'll have access to, so for the time being better stick to English.

Let's look at a diagram template in …\library\GCDkit\Diagrams\Classification\English . For instance a simple one, the Shand diagram. The corresponding file, Shand.r, can be opened in a text editor of your choice (Notepad is the default on your computer, but if you are serious about programming you should look at one of the myriad of more clever alternatives – Notepad++, UltraEdit, WinEdt, Tinn-R…). Shand.r is a 28-line file, which we will now examine.

The first line reads

Shand <- function() {

So we are defining a new function, named Shand(). The function takes no arguments. The opening curly brace { closes at the very end of the file, on line 28, so the whole body of the file is enclosed in the function.

Next are two lines (the two variable names are the conventional x.data and y.data used by Figaro templates):

x.data <<- WR[, "A/CNK"]
y.data <<- WR[, "A/NK"]

Here the code defines two variables (vectors), x.data and y.data; they are subsets of the matrix WR: specifically, x.data gets the content of the column called "A/CNK" and y.data gets "A/NK". Both are calculated by GCDkit from your major-element data when loading a file, so you can trust they'll always be there – otherwise you would have to do the calculation yourself. Note that we use here a "super-assignment", <<- , to make sure the variables are accessible even outside of the function (OK, you do not really need to understand that if you are just up to hacking some diagram templates).

Then comes the core of the graph function, the template. The template will be used by GCDkit's internal plotting system called "Figaro", which is a rather complex affair that allows interactions with the graphs. You do not really need to understand the specifics of it; all you need to know is that you have to supply an object called temp (for template, not temporary!) that will be a list containing all the stuff you'd like Figaro to plot on your diagram (in addition to the datapoints): lines, labels, etc.

Here, the template is assembled on lines 21 to 25 with code that looks at the value of the (Boolean) option gcd.plot.text and accordingly either merges temp1 and temp2, or uses only temp1, to create the list temp. temp2, as you may have guessed, contains the text labels, as we shall see later on.
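As a sketch, the assembly logic just described might read as follows (the contents of temp1 and temp2 here are dummies of my own; gcd.plot.text is the option named in the text):

```r
# GCDkit sets this global option; TRUE means diagrams get text labels
options(gcd.plot.text = TRUE)

# Dummy stand-ins for the real Figaro item lists:
temp1 <- list(lines1 = list("lines", x = c(1, 1), y = c(1, 7)))
temp2 <- list(text1  = list("text", x = 0.75, y = 4, text = "Metaluminous"))

if (isTRUE(getOption("gcd.plot.text"))) {
    temp <- c(temp1, temp2)  # labels requested: merge geometry and text
} else {
    temp <- temp1            # labels off: geometry only
}
```

With the option set to TRUE, temp ends up holding both the line and the label items.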

First, we define the lines with
GCDkit=list("NULL",plot.type="binary",plot.position=14,"A/CNK - A/NK (Shand 1943)")
temp1 is a list of "things" to plot on the graph; each item in the list is a list itself. Here we define the following:
  • clssf, which contains the names to use for classification purposes. Here we indicate that the items to use for classification will be numbers 2 to 5 (number 1 is clssf itself), and the names to be used are, in the same order, "metaluminous"… etc.
  • lines1 to lines4 are, as the first item of each list tells us, "lines". Their coordinates are given as a series of x values, then y values. So the first item, lines1, goes from (1,1) to (1,7) to (0.5,7) to (0.5,1) and back to (1,1). Note that since it will be used for classification purposes, we must close it and return to the origin – hence the first and last points are the same. Also note that these lines have neither colour nor pattern, so they will be plotted but invisible (you may, however, encounter them if you edit the resulting graphical file in Illustrator or similar, and wonder about these surprising invisible polygons – yes, that's why!).
  • lines11 and lines12 are "abline"s; the first one is horizontal at y=1 (h=1), the second one vertical at x=1 (v=1). Both are plotted in colour #2 (which has been defined somewhere else).
  • lines13 is a dashed line, defined as a "lines" Figaro object with x and y values.
  • GCDkit is a list of type "NULL" (not a type known to Figaro), but it contains the info GCDkit needs to add the diagram to the menu: the name under which it will appear (the last string, "A/CNK - A/NK (Shand 1943)") and the position it will take in the list of graphs (plot.position=14).
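The reason the classification fields must be closed becomes clear if you think of how classification works: each sample's (x, y) is tested against every clssf polygon. A self-contained illustration of such a point-in-polygon test (a simple ray-casting routine of my own, not GCDkit's actual code), for a rectangular field spanning A/CNK 0.5–1 and A/NK 1–7 like the metaluminous field above:

```r
# Ray casting: count how many polygon edges a horizontal ray from (x, y)
# crosses; an odd count means the point lies inside the field.
in.field <- function(x, y, px, py) {
    inside <- FALSE
    j <- length(px)
    for (i in seq_along(px)) {
        if ((py[i] > y) != (py[j] > y) &&
            x < (px[j] - px[i]) * (y - py[i]) / (py[j] - py[i]) + px[i]) {
            inside <- !inside
        }
        j <- i
    }
    inside
}

px <- c(1, 1, 0.5, 0.5)   # metaluminous field: A/CNK from 0.5 to 1
py <- c(1, 7, 7, 1)       # A/NK from 1 to 7
in.field(0.8, 2, px, py)  # TRUE  - inside the metaluminous field
in.field(1.2, 2, px, py)  # FALSE - outside this field
```

An open (unclosed) field has no unambiguous inside, which is exactly why the template lines must loop back to their first point.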
temp2 contains more template info: the text labels to draw. All are of type "text", and come with a plotting position, a text, a colour, and adj (how the text will be "aligned" around its nominal position).
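To make this concrete, a temp2 in this spirit might look like the following (the coordinates, label wording and colours here are my guesses for illustration, not the actual file contents):

```r
# Each label is a list: type "text", position, text, colour, alignment
temp2 <- list(
    text1 = list("text", x = 0.75, y = 4,   text = "Metaluminous", col = "black", adj = 0.5),
    text2 = list("text", x = 1.25, y = 4,   text = "Peraluminous", col = "black", adj = 0.5),
    text3 = list("text", x = 0.75, y = 0.5, text = "Peralkaline",  col = "black", adj = 0.5)
)
```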

Both are then merged into temp, as we discussed; and the last thing to do is to call Figaro with
sheet<<-list(demo=list(fun="plot", call=list(xlim=c(0.5,1.5),ylim=c(0,7),main=annotate("A/CNK-A/NK plot (Shand 1943)"),col="green",bg="transparent",fg="black",xlab="A/CNK",ylab="A/NK",main=""),template=temp))
(Or, in fact, to put all the info in a Figaro object, which is actually drawn when calling the function plotDiagram – which is what the GUI does.)

The Figaro object is one more list of lists, which you use to pass the graph template (template=temp, at the end) as well as parameters such as the plot limits (xlim, ylim), the title of the graph (main) and of the axes (xlab, ylab), etc.

The function that has just been defined (called Shand(), as you may remember) does nothing visible on its own. But it can be used as an argument to more interesting functions such as
plotDiagram(diagram="Shand") # asks lots of questions, and eventually plots the diagram
classify(diagram="Shand") # writes the classification of each sample
and these are the functions used in the GUI.

A non-classifying diagram is very similar – except that its template has no clssf object; this allows open fields (remember how, here, the classification fields all have to loop back to the (x,y) of the first point…).

Our own new diagram

La/Yb vs. Yb diagram (here from Moyen and Martin 2012)

So – we can now use our newly acquired knowledge to define a new diagram template. Let's go for my old friend – Martin's La/Yb vs. Yb diagram for TTGs vs. non-TTGs. As you know, this diagram has been seriously misused, and I suggest you read, for instance, Moyen (2009) and Moyen & Martin (2012) before using it, but let's consider for the time being that you know what you are doing, right?

As shown here, this diagram has a closed blue "TTG" field to the left, but in other work it is actually left open to the top. Closing it allows using this diagram as a classification diagram, so let's keep it like that.

Let's create a new R file. It starts with (again using the conventional x.data and y.data variable names):

LaYb <- function() {    # the name of the function
    x.data <<- WR[, "Yb"] / 0.22
    y.data <<- (WR[, "La"] / 0.33) / (WR[, "Yb"] / 0.22)

We are using normalized values here, not ppm. Also, we need to calculate a ratio, (La/Yb)N, that is not available by default in WR, so we do the calculation ourselves.
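As a quick sanity check of the arithmetic (the La and Yb contents below are made up; 0.33 and 0.22 are the chondritic normalization values used in the function above):

```r
La <- 20; Yb <- 0.5                  # hypothetical ppm contents of a sample
Yb.N   <- Yb / 0.22                  # normalized Yb      -> x axis
LaYb.N <- (La / 0.33) / (Yb / 0.22)  # normalized La/Yb   -> y axis
round(LaYb.N, 1)                     # 26.7
```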

Now for the template:

There are only two classification items – if a sample falls anywhere else, "classify" will return the value "unclassified".

The two polygons (TTG for the first one, ordinary calc-alkaline for the second). The most time-consuming part is actually extracting coordinates from the diagram; here I did it by reading the cursor coordinates in Illustrator… Also, we must not forget to define the colour (colour number 2 here, the common GCDkit brown by default), otherwise the polygons will be invisible.

GCDkit=list("NULL",plot.type="binary",plot.position=100,"La/Yb vs. Yb (Martin, 1986)")
Position in the list of available GCDkit graphs. I put it at 100 to be (reasonably) sure it does not interfere with anything else; if I define more diagrams, I'll make them 101, 102, etc.

By the way don't forget to close the parenthesis here (see the final file).

The text annotations. A bit of trial and error is required to get good-looking (x,y) coordinates.

This stays the same as in the Shand example before: I like the idea of a global option defining whether to annotate the diagram or not (plus, I think the un-annotated versions are useful in plates).
                                          main=annotate("La/Yb vs. Yb (Martin, 1986)"),
And the actual Figaro call (here reformatted because I was getting lost in all the parentheses; now you can see what's going on!). Note the use of annotate and square brackets to get subscripts.

Again, a closing curly brace is required here to close the function.

Using it !

Now you can test your diagram. The easiest way is to load the function "manually", perhaps by "sourcing" its code (from File/Source R code in the GUI, or by using source("path\\to\\file\\LaYb.r") from the command line), or even by copy/pasting it into the console. Of course your diagram will not show in the menus (it has not been loaded yet).

But you can still plot it (and, presumably, debug it!) with plotDiagram(diagram="LaYb"), and get the classification with classify(diagram="LaYb",silent=TRUE) (we need silent=TRUE, otherwise GCDkit tries to look into the list of known diagrams to retrieve its name – which, of course, fails miserably since the diagram has not been properly loaded yet).

The "classification" menu after adding the diagram in the right directory

When you're happy with your diagram, it's time to drop it in the …\Diagrams\Classification\English folder, and load some data to see what happens (until you load data, nothing will happen, as GCDkit only searches for diagrams when loading data!).

The diagram in GCDkit. Yes, some rocks are not adakites. Even if they have high La/Yb.

Enjoy your new diagram!

The whole code in a single R file: click here.

Monday, 7 November 2016

Catania Workshop

We held a workshop in Catania, Sicily, from 26 to 30 September this year. It was convened, and beautifully organized, by P. Fiannacca and R. Cirrincione at Università degli Studi di Catania. With 20+ participants, including a sizeable group from Brazil, we think the workshop was quite a success (better ask the participants, though!).

Mt Etna, viewed from the hotel where we stayed (Jeff's balcony, to be specific)...
For the first time, we introduced during the workshop the set of modelling routines that we've been promising for years. Although there is no GUI for them yet, they work nicely from the command line. The next step is to actually release them... We also chanced upon a fantastic R package called gWidgets2, which may well take care of most of our GUI requirements (and, in the process, provide a way forward for a modelling GUI).
The organizers of the workshop: Patrizia Fiannacca, Jeff, Vojta, Rosolino Cirrincione.

The real highlight of the week, however, was clearly not the code but the Wednesday trip to Mount Etna. Despite somewhat cloudy weather, we enjoyed a great trip to the Valle del Bove, under the wise guidance of C. Ferlito.
Descending into the Valle del Bove

Listening to Carmelo!

Monday, 26 September 2016

Hello Catania !

Good morning to all,

If you're sitting in the Catania workshop and are curious whether the address Vojta just mentioned is actually live -- well, yes it is!
Happy GCDkit-ing...

Thursday, 14 July 2016

What have we been working on?

Mikulov - a lovely town in southern Moravia

Sampling local production in Mikulov

GCDkit is a permanent work in progress, and lots of things evolve! In May 2016, we (Vojtech and Jeff) met for a week in Moravia, locked ourselves in a room and coded from morning to evening. A pity considering the touristic attractions, and the view from our room (rest assured that we did sample some of the better options, however), but a huge success in terms of writing code!

So, here is a quick list of what we've been up to; you may regard it as a list of features that will be included in GCDkit, sooner or later (to be honest, later is more likely than sooner).

Interfaces and connections

  • We spent some effort towards making GCDkit truly platform independent. As we all know, GCDkit so far is a purely Windows-specific application; this is largely (only) due to the choice of using Windows-specific menu and interface functions (the menus that appear at the top of the R console). With help and incentive from some correspondents (Jesse Reimink, then at U. Alberta and now at DTM), we worked hard to make GCDkit usable under other systems. To some extent, this can now be done. Jesse had GCDkit running under MacOS, and we had it functional in Linux (well, in an emulator). Of course, you should not expect it to be perfectly stable, and probably not 100% functional either. And obviously the menu system is missing; you have to run it purely from the command line. Still, it works, and it may be included in the distributed version in the not-so-far future.

GCDkit Development in Linux (Ubuntu)

  • Once GCDkit becomes (mostly) platform independent and (mostly) usable without the Windows R GUI, nothing should theoretically stop us (or you) from developing other user interfaces. We explored, for instance, the possibility of running a Python application that would call R/GCDkit. Don't hold your breath, though.
  • An additional benefit of a Python/R connection is that QGIS -- the main open source alternative for GIS -- supports Python plugins. There is therefore a long-term possibility for calling a fully interactive GCDkit session from QGIS, and therefore do GCDkit processing on geographic data.
  • A simpler alternative is to use the QGIS processing toolbox. As a proof of concept we managed, for instance, to load a shapefile (a geographic layer containing points with analyses) into GCDkit, calculate a CIPW norm, and return another geographic layer containing the normative values, which you may then use in the normal GIS way (for instance, colouring by the proportion of a normative mineral, etc.).

Running a GCDkit "geo-algorithm" from QGIS.


One of our main goals was to work on the long-overdue modelling code. After several false starts, we had something to show during the last workshop in Krakow. This time, we totally overhauled the code and wrote something we believe is a rather flexible and powerful framework. From a user perspective, you'll be able to do something like this (don't try it on your system – it is not distributed, and it is very likely to change anyway!):

######## Direct melting of an avg Archaean basalt

# Define the mineral proportions
min.props.mod1<-c(0.6,0.4) # sum must be = 1

# Calculate the model
mod1<-pmDirectBatch(c0.lab="avg Arch basalt",min.prop=min.props.mod1,ff=0.7,"mins",mjr.set=mjrs,"kds",trc.set=trcs)

# Add more info (calculating every 0.1 here)
petroCalc(mod1,0.1) # Calculate every 0.1

# Extract the data for plotting

# Prepare a plot (just the real data)

# Add the model curve
Under the hood, we have defined a "PetroModel" class that stores all the information on a model, and several functions to calculate it. With such an object-oriented approach, the nice thing is that we can very easily expand the possibilities. Since we defined the "overplot" function for the main class, for instance, no additional work is needed to plot a model curve on a graph -- and this will remain true for direct or reverse models, or even for more complex ones including accessory saturation, variable D values, etc., should we decide to include these at some point.
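To illustrate the flavour of such an object-oriented design, here is a toy sketch (not the actual PetroModel class, its slots or its methods, which are unreleased) of a model class with a generic "overplot" method, using R's S4 system:

```r
library(methods)  # S4 classes; the real GCDkit implementation may differ

# A toy model class: melt fraction steps and the computed liquid concentration
setClass("ToyModel", representation(ff = "numeric", cL = "numeric"))

# One generic "overplot" serves any model object...
setGeneric("overplot", function(model, ...) standardGeneric("overplot"))
setMethod("overplot", "ToyModel", function(model, ...) {
    lines(model@ff, model@cL, ...)  # add the model curve to an open plot
})

# ...so adding a model curve to an existing diagram takes a single call.
# Batch melting equation cL = c0 / (D + F(1 - D)), with c0 = 100 ppm, D = 0.2:
ff <- seq(0.1, 0.9, by = 0.1)
m  <- new("ToyModel", ff = ff, cL = 100 / (0.2 + ff * (1 - 0.2)))

plot(c(0, 1), c(0, 400), type = "n", xlab = "F", ylab = "cL (ppm)")
overplot(m, col = "red", lwd = 2)
```

The point of the design is exactly what the paragraph above describes: any new model type only needs its own "overplot" method to work with every existing plot.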

There is no user interface so far, and no real plans to develop one. At the moment one has to resort to the command line and type the functions in, but it is not so difficult. Eventually, we will think of something perhaps using Python (cf. above), or Shiny.

Again, this is not released. It is our next main goal, but will take a bit more time to polish and package (and probably publish !).

Phase equilibria

During the Prague Goldschmidt conference last year, we presented an abstract demonstrating how we could read a PERPLE_X output and further process it to calculate trace element compositions in the melt. Actually, this is a long-standing idea (I [Jeff] started working on it using a monster of an Excel spreadsheet back in 2005, and we started writing this code during a chilly spring in southern France in 2013). We spent some time on that, too, tweaking and fixing the code once more (that nasty monazite saturation routine...), and thinking about a publication -- you're not likely to see this code soon, and certainly not before we publish.

And more...

Well, we also spent some time discussing grant applications for funding our efforts, and celebrating the success of "ouR" book... We also considered plans to link our work with the very nice job done by Matt Mayne on something called RCrust. And, indeed, we enjoyed a lovely walk in the woods around Valtice.

Book Review

ou"R" book has been reviewed by H. Rollinson, who wrote a very nice piece in "Elements". We are very grateful for the positive feedback!

Tuesday, 12 July 2016

Troubles with Windows 10

We have recently been flooded by email reports from those of you who could not start the GCDkit installer on Windows 10. Strangely, the other versions of the system seem unaffected. We have examined the problem, and it seems to stem from a sudden change in Microsoft's security policy, requiring all installation programs to have a valid security certificate.

While we are working on a remedy, the only way on Win 10 seems to be the "advanced" setup, taking advantage of the R library mechanism.
There is no need to be afraid of this. The procedure is as follows:
  • download and install the R 3.2.1 for Windows
  • download the library
  • run the R console in a SDI mode
  • install the library from the R console (Menu Packages > Install package(s) from local zip files...)
  • install the R2HTML, RODBC and sp libraries (Menu Packages > Install package(s)...) [working Internet connection is required]
  • load the GCDkit library (Menu Packages > Load package > GCDkit or type library(GCDkit) into the R Console)
The last step has to be repeated every time you start the R GUI anew.

Alternatively, you can make a shortcut on your Desktop with a target similar to mine (do take care to use the correct path):

C:\R\R-3.2.1\bin\i386\Rgui.exe --silent --no-save --sdi

Then set the start-in directory to the one with your R libraries. In my case it is:


I hope that this will help. Please let us know if there are any further problems. 


Tuesday, 23 February 2016

What is new in GCDkit 4.1 (Chlopskie jadlo)?

A new version of GCDkit was released on 10 February. Apart from numerous bug fixes, and a great deal of editing to enable running GCDkit scripts in batch mode, it brings several new features. Arguably the most interesting include:
Switching between datasets
  • Datasets can be stored using the function pokeDataset; this also happens automatically for data freshly loaded into the system by the functions loadData and accessVar. Technically, they always become components of a list WRCube.
  • Later, previous datasets can be restored by the functions peekDataset or selectDataset.
  • The WRCube can be cleared at any time using the function purgeDatasets; only the most recent copy of the current dataset survives this operation.
  • The function selectAll has been completely rewritten using the dataset switching, i.e. a backup copy of the data stored in the WRCube. The variables WR.bak, WRanh.bak, milli.bak and labels.bak are thus no longer needed.
  • The new function figOverplot allows overplotting a reference dataset onto any Figaro-compatible binary plot, ternary plot or spiderplot. The reference dataset, stored in an arbitrary global variable, can contain either real-world data or a numeric matrix stemming from petrogenetic modelling.
Further important changes
  • figCex and plateCex now accept the value NA (no quotes), which means no plotting of symbols (most useful on spiderplots if only lines are desired).
  • figAddReservoirs has been checked and reprogrammed; it now works for ternary plots, and is Figaro-compatible for binaries, ternaries and spiders. It also includes a lot of new arguments, mainly plotting parameters.
  • New functions oxide2ppm and ppm2oxide recast wt. % of a given oxide to ppm of the cation, and vice versa.
  • Platform-independent functions tk_winDialog and tk_winDialogString have been written in Tcl/Tk.
  • calcCore does not require global variables anymore, but can also use those from the parental environment.
  • accessVar no longer uses the Windows clipboard to transfer data into the variables WR and labels.
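For the curious, the arithmetic behind such an oxide-to-ppm recast is simple; the following is an illustration of the principle for Ti, not the actual GCDkit function:

```r
# wt. % oxide -> ppm cation: scale by 10^4 (percent to ppm) and by the
# ratio of the atomic weight of the cation to the molecular weight of
# the oxide (values below for Ti, 47.867, and TiO2, 79.866)
TiO2.wt <- 1.5                              # wt. % TiO2
Ti.ppm  <- TiO2.wt * 1e4 * 47.867 / 79.866
round(Ti.ppm)                               # ~8990 ppm Ti
```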

We trust that you will find this new release useful and reasonably stable.

Good luck, Vojtech