
14 October 2013

How to Build A Climate Diagram in R

Walter & Lieth climate diagrams illustrate precipitation and temperature changes throughout the year in one standardized chart. They are especially useful for identifying water stress or other climatic factors that significantly affect plants. This step-by-step guide shows how to generate your own Walter & Lieth climate diagram using R. Here are some samples that I made for several places in the Andean highlands following this process:

This climate diagram of Juliaca shows water stress from May to April, and excess water in January and February. The red line is temperature, measured on the left axis. The purple line is precipitation, measured on the right axis. The x-axis is one year, measured in months, from January to December. Below is the same type of diagram for four other locations in the Andes. Together they show the wide variation in climate found in this bioregion.
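For readers who like to see the logic in numbers, here is a minimal R sketch of how "water stress" and "excess water" are usually read off a Walter & Lieth diagram. It assumes the standard axis convention, in which 10 ºC on the temperature axis lines up with 20 mm on the precipitation axis, so a month reads as dry when its precipitation in mm falls below twice its mean temperature in ºC, and as very wet above 100 mm. The function name and the sample values are hypothetical and are not taken from the diagrams above.

```r
# Classify months using the usual Walter & Lieth axis convention
# (10 degrees C on the temperature axis = 20 mm on the precipitation axis).
# classify_months() is a hypothetical helper, not part of any package.
classify_months <- function(precip_mm, mean_temp_c) {
  status <- rep("humid", length(precip_mm))
  status[precip_mm < 2 * mean_temp_c] <- "dry"   # precipitation curve below temperature curve
  status[precip_mm > 100] <- "very wet"          # the filled area above 100 mm
  status
}

# Made-up values for three months, for illustration only.
example_status <- classify_months(precip_mm = c(5, 50, 150),
                                  mean_temp_c = c(10, 10, 10))
print(example_status)
```

The same function can be applied to all twelve months of your own table once the mean temperature has been calculated from the maximum and minimum rows.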



R is a command-line-based, open source program that is hugely flexible for computation and quantitative visuals. This post assumes basic knowledge of R. If you don't know anything about R, see this post for an introduction and links to more information. You will also need a basic working knowledge of Excel or similar spreadsheet software. I find that OpenOffice's spreadsheet software works even better than Excel, and it's open source to boot.

For any more advanced R users reading this post, there are definitely more efficient ways to do this task. I put together this tutorial for a 2-hour introductory workshop with people who had never used the software before, so I chose to make it as simple as possible. Please feel free to do this your way, and to post comments with any recommendations.

Downloads you will need:


In this tutorial we will use the climatol package. Its description and guide can be found here:
http://cran.r-project.org/web/packages/climatol/index.html
You will need to download this package, using R, prior to starting this process.


Part 1: Gathering Data


Before starting, have this data on hand for your location of interest:
  1. Name of the location.
  2. Elevation of the location in meters above sea level.
  3. The range of years that the climate data was collected.
  4. The monthly climate data as per the table below (Abs min t is optional).

Part 2: Preparing the data in a spreadsheet


1. Create a spreadsheet with the following information in the upper left-hand corner of the file (replace the numbers to match your data - this is just an example).


Precip = average precipitation in mm per month
Max temp = average daily maximum temperature per month in ºC
Min temp = average daily minimum temperature per month in ºC
Abs min t = absolute (lowest recorded) minimum temperature per month in ºC

If you do not have the abs min t data, simply copy the minimum temperature data from the row above, but do not leave the cells blank.

2. Delete the first column in Excel so that only the data remains. Be sure to delete any extra text and formatting anywhere in the file.

3. Save the file as a .csv
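If you would rather see the target layout as code, the sketch below builds a table of four rows (Precip, Max temp, Min temp, Abs min t) by twelve monthly columns directly in R and saves it as a .csv. It assumes the first row of the file holds month names, which read.csv treats as a header by default. All of the numbers are invented placeholders, and the file name is hypothetical.

```r
# Build a 4 x 12 table in the layout described above and save it as .csv.
# All numbers are invented placeholders, NOT real climate data.
example <- rbind(
  c(130, 110, 100, 45, 10, 5, 3, 8, 25, 45, 60, 95),   # Precip (mm)
  c(17, 17, 17, 17, 17, 16, 16, 17, 18, 19, 19, 18),   # Max temp (degrees C)
  c(4, 4, 3, 1, -4, -7, -8, -5, -1, 1, 2, 3),          # Min temp (degrees C)
  c(0, 0, -2, -5, -10, -13, -14, -12, -8, -5, -3, -1)  # Abs min t (degrees C)
)
colnames(example) <- month.abb  # Jan ... Dec become the header row

# Write to a temporary folder so nothing in your own files is overwritten.
csv_path <- file.path(tempdir(), "example_climate.csv")
write.csv(example, csv_path, row.names = FALSE)

# Read the file back and confirm the shape: 4 rows, 12 columns.
check <- read.csv(csv_path)
print(dim(check))
```

Opening the resulting file in a spreadsheet program is a quick way to compare it against your own file.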

Part 3: Working in R


1. In R, use the Package drop-down menu to install the climatol package. You will need an internet connection to do this.

2. Attach the installed package to your session using this command:

> library(climatol)


3. Set the working directory to the folder where you keep your files using the Misc drop-down menu. Verify it with this command, which should return the file path of the directory you selected in the drop-down menu.

> getwd()

4. Load and assign a name to your data:

> name=read.csv("folder/subfolder/datafile.csv")

5. View your data and check for anything unexpected. Reload as needed.

> name

COMMON ERROR: If you see anything strange here, go back to step 2 of Part 2 and delete any rows and columns that are adjacent to your data (even if they are blank). Be sure to put your cursor in cell A1, which should be the upper left-hand corner of your table, before saving. You can also try a "paste special" of "values" only into a new spreadsheet and resaving. If it still doesn't look right, try opening your spreadsheet in OpenOffice (it's free) and converting it into .csv from there, with no extra formatting.
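Checking the table by eye works, but a few lines of R can also flag the most common problems. This is only a sketch: check_climate_table is a hypothetical helper, and it assumes the layout from Part 2, meaning four rows of numbers in twelve monthly columns with no blank cells.

```r
# Hypothetical helper: returns a list of problems found in the loaded table,
# assuming 4 rows (variables) by 12 columns (months) of numbers.
check_climate_table <- function(dat) {
  dat <- as.data.frame(dat)
  problems <- character(0)
  if (nrow(dat) != 4) {
    problems <- c(problems, sprintf("expected 4 rows, found %d", nrow(dat)))
  }
  if (ncol(dat) != 12) {
    problems <- c(problems, sprintf("expected 12 columns, found %d", ncol(dat)))
  }
  if (!all(vapply(dat, is.numeric, logical(1)))) {
    problems <- c(problems, "some columns are not numeric")
  }
  if (anyNA(dat)) {
    problems <- c(problems, "some cells are blank (NA)")
  }
  problems
}

# A clean table, and one where the label column was not deleted.
good <- as.data.frame(matrix(1:48, nrow = 4))
bad <- cbind(label = c("Precip", "Max temp", "Min temp", "Abs min t"), good)
print(check_climate_table(good))
print(check_climate_table(bad))
```

An empty result means none of these particular problems were found; it does not guarantee that the values themselves are correct.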

6. Create the plot (this returns in a new window):

> diagwl(name, est="Location",alt=elevation, per="dates", mlab="en")

COMMON ERROR: If you do not see a window with the graph open up, select the button that looks like a bar graph at the top of the R window, and try the command again, using the up arrow.

7. Adjust the colors if you like, using the color chart pdf:

> diagwl(name,est="Location",alt=elevation,per="dates",mlab="en",pcol="#color1",sfcol="#color2")

These are the color parameter names, with their default colors. R is case-sensitive, so type them in lowercase exactly as shown:

pcol Color pen for precipitation ("#005ac8").
tcol Color pen for temperature ("#e81800").
pfcol Fill color for probable frosts ("#79e6e8").
sfcol Fill color for sure frosts ("#09a0d1").

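8. Assign a name to the plot command. Note that diagwl draws directly to the graphics window, so the name may not hold a reusable copy of the image; rerun the diagwl command whenever you need to redraw it: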

> plotname=diagwl(name,est="Location",alt=elevation,per="dates",mlab="en",
pcol="#color1",sfcol="#color2")

9. Save the plot in your working directory as an .eps file, using this string of commands:

> setEPS()
> postscript("plotname.eps", height = 10, width = 10)
> diagwl(name,est="Location",alt=elevation,per="dates",mlab="en",
pcol="#color1",sfcol="#color2")
> dev.off()


10. Repeat from step 4 to create any additional diagrams while in the same session.
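If you end up saving many diagrams, the open-draw-close pattern from step 9 can be wrapped in a small function. The sketch below is one possible way to do it: save_eps is a hypothetical helper, and a simple placeholder plot stands in for the diagwl command so that the example runs without any extra packages.

```r
# save_eps() is a hypothetical helper: it opens an EPS device, runs the
# drawing function it is given, and always closes the device afterwards.
save_eps <- function(path, draw, size_in = 10) {
  setEPS()                                     # EPS-friendly device settings
  postscript(path, height = size_in, width = size_in)
  on.exit(dev.off())                           # close the file even if drawing fails
  draw()
  invisible(path)
}

# Placeholder plot; in the tutorial this would be the diagwl(...) command.
eps_path <- file.path(tempdir(), "plotname.eps")
save_eps(eps_path, function() {
  plot(1:12, type = "l", xlab = "Month", ylab = "Value")
})
print(file.exists(eps_path))
```

Closing the device with on.exit means a typo in the plotting command will not leave a half-written file open in your session.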

I originally developed this tutorial as part of a workshop for graduate students in Environmental Science and Policy at Central European University in Budapest, Hungary.





13 October 2013

Introduction to R Software

This introductory post on R software is just enough to get you started so that the other posts on R don't need to repeat the same information at the beginning each time.  It was developed in collaboration with Thomas Pienkowski.

What is R?


R is a free, open source software that uses a command interface to interact with the user.  Outputs include strings of text within the command line as well as graphical outputs in a separate window.  It has some basic commands that are supplemented by packages developed by third parties and individually downloaded and installed, like plugins or apps.   

What is an R Session?
Each session in the R Console starts from scratch. Strings of commands can be stored in RStudio or some other separate program. A session ends when you close the R window. This means that each time you restart R, you must re-attach any packages and re-enter the string of commands that you have worked out thus far. This may seem like a major pain, but much of the time in R is actually spent working out the commands, not entering them.

Packages:
Packages exist in one of three sequential states in relation to your use of R:
1.     A developed package that is not yet on your computer, but exists
2.     A package that has been installed on your computer and is available to be attached during each session
3.     A package that has been attached to your particular session and is ready for use
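The three states above can be checked from the command line. This is a minimal sketch, and package_state is a hypothetical helper name: it looks for the package in the list of attached packages first, then on your computer.

```r
# Hypothetical helper that reports which of the three states a package is in.
package_state <- function(pkg) {
  if (paste0("package:", pkg) %in% search()) {
    "attached"                                  # state 3: ready for use
  } else if (nzchar(system.file(package = pkg))) {
    "installed"                                 # state 2: on your computer
  } else {
    "not installed"                             # state 1: exists elsewhere only
  }
}

print(package_state("base"))      # base is always attached
print(package_state("climatol"))  # depends on your own setup
```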

Commands:


Commands are typed by you in the command line.  A command line always starts with > and then has a command name followed by parameters enclosed in ( ) and separated by commas.  Names are sometimes enclosed in " ".   

For example:
> plot(1:10, main = "x", col = "red")


Hit "return" to enter a command.  The program will read the command and then do one of four things:
1.     Return requested information below the command line, which may involve performing a task or calculation
2.     Perform a task visualized in another window (usually a graph)
3.     Perform a task, but not return any information in the command line
4.     Return an error message

To go through commands you have previously typed in your session, use the up arrow.
To create your own "uniquename" for a data set (this is called an assignment), or even for the result of a command performed on a data set:
> uniquename = command

Note that R commands are case-sensitive, so X is different from x. Spaces between the parts of a command generally do not matter.
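Here is a short example of assignment and case sensitivity in action. The object names and numbers are made up for illustration. Both = and <- create an assignment; <- is the form you will see most often in other people's R code.

```r
heights = c(150, 160, 170)   # assignment with =
Heights <- c(1, 2, 3)        # a different object: note the capital H

# Store the result of a command under its own name.
mean_height = mean(heights)
print(mean_height)

# heights and Heights are separate objects with different contents.
print(identical(heights, Heights))
```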

Commands can also pull information from a subset of an object or dataset.  To do this, use [ ]  For example, in the case of a matrix (or spreadsheet or table) of data called X, typing: > X[1,2] will return the data in the first row, second column.
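As a small illustration of the square brackets, the sketch below builds a made-up 2 x 3 matrix and pulls out a single cell, a whole row, and a whole column. Leaving the position before or after the comma empty means "all rows" or "all columns".

```r
# A made-up matrix, filled row by row:
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
X <- matrix(1:6, nrow = 2, byrow = TRUE)

first_row_second_col <- X[1, 2]   # one cell
second_row <- X[2, ]              # every column of row 2
third_col <- X[, 3]               # every row of column 3

print(first_row_second_col)
print(second_row)
print(third_col)
```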

Helpful Commands:

To get help on the function of a command X:
> ?X
To search all of the help pages for the term X:
> ??X
To return command names that contain X:
> apropos("X")
To see the color options:
> colors()
To see installed packages:
> library()
To install packages, use the dropdown menu called "Packages and Data," and select "Package Installer," or use this command to install package X:
> install.packages("X")
To see a list of the objects in your current session:
> ls()
To view basic information on a dataset X:
> summary(X)

Resources:

Download R to install on your computer.

Good self-learning introduction:

A place to explore packages:

Where to find packages to install:

To find packages that help R interface with other software, such as Google Earth:

PDF color chart:

What is your favorite R resource?  Feel free to share it below.
 

15 September 2012

Factors Influencing ICLEI Membership

What makes an ICLEI member city?

ICLEI – Local Governments for Sustainability is an international NGO that works with local institutions, primarily municipalities, to support the implementation of sustainability goals. One of the primary mechanisms that ICLEI uses to implement its programs is official membership, for which it charges a small yearly fee. In return, members gain access to grant opportunities, international recognition, publications, workshops, and opportunities to participate in sustainability programs. The ICLEI website lists 1173 local governments and associated entities as current members, representing 81 countries. Of these members, 947 are local municipalities (cities, towns, etc.) and the other 226 are city networks, nonprofits, and county or regional governments. Because membership is so vital to ICLEI as an organization, and often serves as a primary support vehicle for the implementation of the UN’s Agenda 21, this analysis seeks to determine whether any country-level indicators correlate with city membership in ICLEI.

Methodology

My analysis uses the list of members available here on ICLEI’s website. I separated cities, villages, towns, and other small governments from regional or county level governments, networks, NGOs and other such groups. As a rough indicator of the level of participation of cities in each country, I used the number of city members divided by country population, resulting in values ranging from zero to just over 3 members per million citizens. Zero indicates countries that had no small government members, only regional or other types of members (see figure below). The two top scoring countries, the Maldives and Iceland, each have relatively small populations and contain one member city that accounts for about a third of the total population. Subsequent countries have less extreme ratios of member cities to total population.
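The participation indicator is simple to compute in R. The sketch below uses three invented countries with made-up member counts and populations, purely to show the arithmetic; none of these numbers come from the ICLEI data.

```r
# Invented example data, NOT the real ICLEI membership list.
members <- data.frame(
  country = c("A", "B", "C"),
  city_members = c(1, 12, 0),
  population = c(350000, 60000000, 5000000)
)

# City members per million citizens.
members$per_million <- members$city_members / (members$population / 1e6)
print(members)
```

A small country with a single member city scores far higher than a large country with a dozen, which is the same effect described above for the top scoring countries.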



I then looked at four potential country-level indicators for city membership: (1) the presence of an ICLEI regional office in the country, (2) whether the country is a Kyoto protocol signatory, (3) GDP per capita, and (4) the GINI index (a common measure of income inequality, published by the World Bank). I explain below why I selected each variable.

(1) The ICLEI website indicates regional offices in South Africa, Canada, the United States, Germany, Japan, Korea, Brazil, Mexico, Australia, India, and the Philippines. These 11 countries account for 850 memberships, about 72% of the 1173 total memberships (529 of these are in the U.S.). I hypothesized that having a regional office in the country would correlate with higher membership rates due to stronger ICLEI networks in countries with ICLEI staff. (source: http://www.iclei.org/index.php?id=global-contact-us)
(2) I used signing the Kyoto protocol as a rough indicator of how environmentally mindful the national government was in a particular country (see somewhat out-of-date map below to get an idea). I did not take into account ratification status, since nearly every country in this analysis has ratified other than Canada and the U.S. This variable could result in a higher likelihood of membership if the environmental leanings of the national government reflect the views of the citizenry. Alternatively, it is possible that local governments in countries that had not signed the Kyoto protocol would need to use ICLEI’s services more, due to lack of national government support. This would generate a negative correlation between the two variables. Either way, I expected this variable to have some effect on membership rates. (source: http://en.wikipedia.org/wiki/List_of_parties_to_the_Kyoto_Protocol)
(3) GDP per capita serves as a measure of country wealth, and often as an indicator of citizen interest in environmental measures, especially due to the impression that sustainable activities reduce economic prosperity.  I expected GDP to correlate positively with membership. (source: http://data.worldbank.org/indicator/NY.GDP.PCAP.CD/countries)
(4) The GINI index measures the disparity between the wealthiest and the poorest in a country, with higher values meaning greater inequality. This can be used as another type of measurement for country prosperity, and I expected more equal countries (those with lower GINI values) to have higher membership. (source: http://data.worldbank.org/indicator/SI.POV.GINI)

In plain English, I expected that member cities would be more likely in countries that contained a regional ICLEI office, had higher GDP, and had higher income equality. I wasn’t sure whether being a Kyoto signatory would increase or decrease membership rates, but expected there to be some effect. I did not expect these indicators to correlate to a high degree with membership, due to the complexities inherent in the decision to join ICLEI, but expected to find some predictability in the results.

World Map of Kyoto signatories and ratification status. 
Many more countries have since ratified the treaty, including Australia, Turkey, and over 20 others.
(photo source: morriscourse.com)

For the sake of simplicity, I used country-level information, since there were only 81 countries but over 900 cities. Further investigations could look at city-level indicators in order to find potentially stronger correlations, and could take into account other variables such as municipality size or budget.

For the actual analysis, I used the free software R to perform a multiple linear regression on the data.
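For anyone who wants to try something similar, a multiple linear regression in R is usually run with the lm command. The sketch below shows the general shape of such an analysis on simulated data. The column names, the simulated values, and the model formula are my assumptions for illustration; this is not the original script or the original data.

```r
# Simulated stand-in data for 81 countries (NOT the real data set).
set.seed(1)
n <- 81
d <- data.frame(
  office = rbinom(n, 1, 0.15),     # 1 = hosts an ICLEI office
  kyoto = rbinom(n, 1, 0.9),       # 1 = Kyoto signatory
  gdp_pc = runif(n, 500, 60000)    # GDP per capita in US$
)
d$per_million <- 0.3 + 1e-05 * d$gdp_pc + rnorm(n, sd = 0.5)

# Fit city members per million against the three indicators.
fit <- lm(per_million ~ office + kyoto + gdp_pc, data = d)
print(summary(fit))
```

The summary output has the same layout as the results reported below: residuals, a coefficient table with one row per variable, and the R-squared values.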

Results

After performing the multiple linear regression analysis, which fits a linear model to estimate how strongly each variable is associated with ICLEI membership, I found that the most correlative descriptor was GDP/capita. Even this was not a big predictor: each US$1,000 increase in GDP/capita corresponded to only about 0.013 additional city memberships per million people. Surprisingly, hosting an ICLEI office and the GINI index were not statistically significant factors. I ended up dropping the GINI index from the analysis altogether since it was not helping the overall accuracy of the results (read: R-squared values decreased). Signing the Kyoto protocol had a slight negative correlation, but not one significant enough to account for much.

For those who prefer to read the statistics, here are the base results given by R:

Residuals:
     Min        1Q    Median        3Q       Max
-0.82992  -0.40870  -0.15950   0.08209   2.66220

Coefficients:
                     Estimate  Std. Error  t value  Pr(>|t|)
(y-Intercept)       4.047e-01   1.404e-01    2.882   0.00512
ICLEI office        2.597e-02   2.373e-01    0.109   0.91315
Kyoto Signatory    -3.164e-01   1.646e-01   -1.922   0.05833
GDP/capita (US$1)   1.326e-05   3.846e-06    3.448   0.00092

Residual standard error: 0.7257 on 77 degrees of freedom
Multiple R-squared: 0.1597
Adjusted R-squared: 0.127
F-statistic: 4.879 on 3 and 77 DF
p-value: 0.003694
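To make the coefficient table more concrete, here is a short worked example in R that uses the estimates reported above. The predict_rate function and the sample country (a Kyoto signatory without an ICLEI office and with a GDP/capita of US$20,000) are hypothetical and only illustrate how the numbers combine.

```r
# Estimates copied from the coefficient table above.
coefs <- c(intercept = 4.047e-01, office = 2.597e-02,
           kyoto = -3.164e-01, gdp_pc = 1.326e-05)

# Change in city members per million for each extra US$1,000 of GDP/capita.
per_1000_usd <- unname(coefs["gdp_pc"] * 1000)

# Hypothetical helper: predicted city members per million for one country.
predict_rate <- function(office, kyoto, gdp_pc) {
  unname(coefs["intercept"] + coefs["office"] * office +
           coefs["kyoto"] * kyoto + coefs["gdp_pc"] * gdp_pc)
}
example_rate <- predict_rate(office = 0, kyoto = 1, gdp_pc = 20000)

# Adjusted R-squared recomputed from the multiple R-squared,
# with n = 81 countries and 3 predictors.
adj_r2 <- 1 - (1 - 0.1597) * (81 - 1) / (81 - 3 - 1)

print(per_1000_usd)
print(example_rate)
print(round(adj_r2, 3))
```

With an R-squared this low, predictions like example_rate are best read as rough tendencies, not as forecasts for any particular country.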

In an effort to find stronger correlations, I tried using all memberships rather than just cities, and various variable transformations. None of these manipulations resulted in any headway on answering the question at hand.

Conclusions

I was so sure that hosting an ICLEI office would have a positive correlation with memberships that I would caution against using these results without further analysis. Purely looking at membership numbers would suggest such a trend, but it may be that higher GDP is in fact the stronger correlate. These findings suggest that ICLEI memberships are more difficult to predict than I had originally anticipated. While it may be easier to gain memberships in wealthier countries, this is not a strong correlation, and is much weaker than might have been expected. Thus far, it does not appear that there is a shortcut that can aid in gaining memberships more quickly. This may also indicate that the globe is indeed pulling together, at least in cities, to work on global issues, and is not as divided as the Kyoto protocol or income inequalities might indicate.

Future analyses may want to pursue energy sources or climate impact as possible drivers for membership as well.  Please send me your comments on what other options could be explored, or to request the raw data.