This so-called R Markdown file accompanies a 3-hour Reproducible Open Coding Kit (ROCK) workshop developed by Szilvia Zörgő & Gjalt-Jorn Peters. More details are available below in section Links and resources.
The easiest way to get started is to copy this file to a Posit Cloud project of your own. To do that, first visit the shared Posit Cloud project at https://posit.cloud/content/6434221. Note that to view it, you will need to be logged in with a Posit Cloud account, so create that first if you don’t have one yet.
Once it has loaded, click “Save a permanent copy” at the top:
This will store the project in your account’s workspace, so that your changes are preserved and you can always return to it. If you do not save a permanent copy, you will be ejected from the temporary project after a while and will have to start over.
If you are already familiar with R, RStudio, and Git, you can also download this project and use your local RStudio Desktop installation. For the URL to the Git repository, see the Appendix.
The script below contains R commands (in the gray sections called “chunks”), which can be run individually by pressing the green “play button” in the chunk’s upper right corner. Note that you will only see this option if you open the script in Posit Cloud or RStudio.
Run this chunk every time you start a session!
The chunk below will install all R packages needed to run the commands in the script. It also contains default options for {rock} and paths to subdirectories. Run it by clicking on the green play button in the top right corner of the chunk.
### package installs and updates
packagesToCheck <- c("rock", "here", "knitr", "writexl");
for (currentPkg in packagesToCheck) {
  if (!requireNamespace(currentPkg, quietly = TRUE)) {
    install.packages(currentPkg, repos="http://cran.rstudio.com");
  }
}
knitr::opts_chunk$set(
  echo = TRUE,
  comment = ""
);
rock::opts$set(
  silent = TRUE,
  idRegexes = list(
    cid = "\\[\\[cid[=:]([a-zA-Z][a-zA-Z0-9_]*)\\]\\]",
    coderId = "\\[\\[coderid[=:]([a-zA-Z][a-zA-Z0-9_]*)\\]\\]"
  ),
  persistentIds = c("cid", "coderId")
);
### Set paths for later
basePath <- here::here();
dataPath <- file.path(basePath, "data");
scriptsPath <- file.path(basePath, "scripts");
resultsPath <- file.path(basePath, "results");
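To get a feel for what the `cid` pattern configured in `idRegexes` matches, the following minimal sketch applies that same regular expression to a few made-up lines using only base R. The example strings and the names `cidRegex`, `exampleLines` and `extractedIds` are illustrative assumptions; this is not how {rock} itself processes sources.

```r
### Same pattern as the `cid` entry in `idRegexes` above
cidRegex <- "\\[\\[cid[=:]([a-zA-Z][a-zA-Z0-9_]*)\\]\\]";

### Made-up lines (assumption, for illustration only)
exampleLines <- c(
  "[[cid=alice]]",
  "[[cid:bob_2]]",
  "[[cid=1abc]]",          # identifier starts with a digit, so no match
  "no identifier here"
);

### regexec() returns the full match plus the captured group
matches <- regmatches(exampleLines, regexec(cidRegex, exampleLines));

### Keep only the captured identifier; NA where the line did not match
extractedIds <- vapply(
  matches,
  function(m) if (length(m) == 2) m[2] else NA_character_,
  character(1)
);

print(extractedIds);
```

The pattern accepts both `=` and `:` as separator, and requires the identifier to start with a letter.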
Three plain text files containing data (i.e., “sources”) have been placed into the “010---raw-sources” subdirectory located within the data directory. In addition, some attributes of the mock data providers are listed in the file called “attributes.rock”.
The cleaning command places each of the sentences in your data on a new line. The {rock} package enables you to code data line-by-line, and recognizes newline characters as indicators of this lowest level of segmentation. The chunk below will write the cleaned sources found in “010---raw-sources” into the subdirectory “020---cleaned-sources”.
rock::clean_sources(
  input = file.path(dataPath, "010---raw-sources"),
  output = file.path(dataPath, "020---cleaned-sources")
);
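To illustrate what “placing each sentence on a new line” means, here is a minimal base R sketch. It is only an approximation under stated assumptions: the splitting rule (break after `.`, `!` or `?` followed by whitespace) and the example text are made up for illustration, and `rock::clean_sources()` may apply different and more extensive rules.

```r
### Made-up raw text (assumption, for illustration only)
rawText <- "I went to the doctor. She was kind! Did it help? Yes.";

### Split on whitespace that directly follows sentence-ending punctuation
sentences <- unlist(
  strsplit(rawText, "(?<=[.!?])\\s+", perl = TRUE)
);

### Show one sentence (utterance) per line
cat(sentences, sep = "\n");
```

Each resulting line is then a separate utterance that can be coded on its own.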
If it makes sense for your project, you may choose to add a unique identifier to each line of data (i.e., “utterances”). This is helpful, for example, if you want to merge different versions of the coded sources into a source that contains all codes applied by multiple researchers. The chunk below will write the sources with uids into the subdirectory “030---sources-with-uids”.
rock::prepend_ids_to_sources(
  input = file.path(dataPath, "020---cleaned-sources"),
  output = file.path(dataPath, "030---sources-with-uids")
);
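The idea of prepending a unique identifier to every utterance can be sketched in base R as follows. This is a simplified illustration, not the {rock} implementation: the `[[uid=...]]` notation is assumed by analogy with the `[[cid=...]]` pattern in the setup chunk, and the sequential `demo001`-style identifiers are hypothetical (the package generates its own identifiers).

```r
### Made-up utterances (assumption, for illustration only)
utterances <- c(
  "I went to the doctor.",
  "She was kind.",
  "It helped."
);

### Hypothetical identifiers: one per utterance, all distinct
uids <- sprintf("[[uid=demo%03d]]", seq_along(utterances));

### Prepend each identifier to its utterance
sourceWithUids <- paste(uids, utterances);

cat(sourceWithUids, sep = "\n");
```

Because every line carries its own identifier, the same utterance can later be recognized across differently coded versions of a source.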
Please visit the rudimentary graphical user interface, iROCK (available at https://i.rock.science). This interface allows you to upload your sources, as well as codes and section breaks (for higher levels of segmentation), then drag and drop those into the data.
Click the ‘Sources’ button at the top to load a source. It will show you a dialogue similar to that shown in Figure 3. To load the example source, copy-paste the following URL into the field as shown in Figure 3 and press [ENTER].
Then repeat that to load the example codes and section breaks, this time copy-pasting these two URLs:
Example deductive codes to use: https://rock.science/workshop/3hr/codes
Example breaks to use: https://rock.science/workshop/3hr/breaks
Once you have loaded all three files into the right place, you should see something similar to what is shown in Figure 4:
You can now start coding and segmenting. To use one of the codes or section breaks you loaded, drag them from the right-hand panel and drop them where you want them in the source. If you make a mistake, simply click the section break or code to delete it again.
When you are done coding, you can download the coded source by clicking Download. Normally, it is vital not to forget this step, but in this workshop you will be working with pre-added coded sources.
Run this chunk every session during which you want to employ the functionality below (e.g., inspecting fragments, code frequencies, heatmaps)!
This command will assemble all your coded sources and attributes into an R object that can be employed to run analyses and other commands below. Note, coded sources and attributes have been pre-added for your convenience.
dat <-
  rock::parse_sources(
    dataPath,
    regex = "_coded|attributes"
  );
This command allows you to collect and inspect the coded fragments for certain codes. To use it, change the code labels “CodeA” and “CodeB” in the command below to the codes you’d like to inspect. You can modify the amount of context shown around each coded utterance by changing “2” to any other number.
rock::inspect_coded_sources(
  path = here::here("data", "040---coded-sources"),
  fragments_args = list(
    codes = "CodeA|CodeB",
    context = 2
  )
);
[Output: a list of coded fragments, each labelled with its source: 001_Source_cleaned_withUIDs_coded.rock, 002_Source_cleaned_withUIDs_coded.rock, or 003_Source_cleaned_withUIDs_coded.rock]
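The role of the `context` setting can be illustrated with a small base R sketch that selects every line matching a code, plus a number of lines before and after it. This is a conceptual approximation only: the `[[CodeA]]` notation for an applied code and the example lines are assumptions made for illustration, and `rock::inspect_coded_sources()` does considerably more (such as formatting the output).

```r
### Made-up coded lines (assumption, for illustration only)
codedLines <- c(
  "We talked about the weather.",
  "Then the topic changed.",
  "I felt anxious before the exam. [[CodeA]]",
  "My friends helped me study.",
  "Afterwards we went for a walk.",
  "Nothing else happened that day."
);

### Which codes to look for, and how many lines of context to include
codePattern <- "\\[\\[(CodeA|CodeB)\\]\\]";
context <- 2;

### Line numbers of utterances that carry one of the codes
hits <- grep(codePattern, codedLines);

### For each hit, take the surrounding lines, staying within the source
fragments <- lapply(hits, function(i) {
  from <- max(1, i - context);
  to <- min(length(codedLines), i + context);
  codedLines[from:to];
});

print(fragments);
```

With `context <- 2`, the single coded utterance is returned together with the two lines before and the two lines after it.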
With this command, the {rock} package creates a code tree, which can be flat or hierarchical depending on the employed codes. In this workshop, we use a flat code structure.
rock::show_fullyMergedCodeTrees(dat)
This command will show you a bar chart of the code frequencies within the various sources in which the codes were applied. The command also produces a legend at the bottom of the visual to help identify the sources based on color.
rock::code_freq_hist(
  dat
);
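The numbers behind such a bar chart are simply counts of code applications per source. As a hedged illustration, the base R sketch below tabulates a small made-up set of codings; the data frame `codings` and its values are assumptions, and the structure of the object returned by `rock::parse_sources()` is different.

```r
### Made-up codings: one row per applied code (assumption, illustration only)
codings <- data.frame(
  source = c("001", "001", "002", "003", "003", "003"),
  code   = c("CodeA", "CodeB", "CodeA", "CodeA", "CodeC", "CodeC")
);

### Cross-tabulate codes (rows) by sources (columns)
freqTable <- table(codings$code, codings$source);
print(freqTable);

### Total frequency of each code across all sources
codeTotals <- rowSums(freqTable);
print(codeTotals);
```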
Code co-occurrences can be visualized with a heatmap. This representation will use colors to indicate the code co-occurrence frequencies. Co-occurrences are defined as two or more codes occurring on the same line of data (utterance).
rock::create_cooccurrence_matrix(
  dat,
  plotHeatmap = TRUE
);
      CodeA CodeB CodeC CodeD
CodeA     6     1     2     1
CodeB     1     8     1     3
CodeC     2     1     4     0
CodeD     1     3     0    11
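A co-occurrence matrix like the one above can be derived from a table that records, for each utterance, whether each code was applied. The base R sketch below shows the idea with made-up data; the matrix `codings` is an assumption, and this is not necessarily how {rock} computes its result. In this sketch, the diagonal holds the number of utterances coded with each code.

```r
### Made-up codings: rows are utterances, columns are codes,
### 1 means the code was applied to that utterance (assumption)
codings <- matrix(
  c(1, 1, 0,
    0, 1, 0,
    1, 0, 1,
    0, 1, 1),
  ncol = 3,
  byrow = TRUE,
  dimnames = list(NULL, c("CodeA", "CodeB", "CodeC"))
);

### t(codings) %*% codings counts, for every pair of codes,
### the utterances where both were applied
cooccurrences <- crossprod(codings);
print(cooccurrences);
```

The resulting matrix is symmetric, because a co-occurrence of CodeA with CodeB is also a co-occurrence of CodeB with CodeA.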
This command will export a tabularized version of your dataset, which can, for example, be used to further process your data with software such as Epistemic Network Analysis (https://www.epistemicnetwork.org), or “merely” to represent your coded data in a single file. In this dataset, rows are constituted by utterances, columns by variables and data. The result will be an Excel file called “mergedSourceDf.xlsx” located in the results subdirectory.
Beware: when re-generating the qualitative data table, the {rock} default is to prevent overwriting, so either allow overwriting within the script, or delete the old Excel file before you run this chunk. (The Posit Cloud version of this script allows overwriting.)
rock::export_mergedSourceDf_to_xlsx(
  dat,
  file.path(resultsPath,
            "mergedSourceDf.xlsx")
)
Warning in export_mergedSourceDf_to_file(x = x, file = file, exportArgs =
exportArgs, : The file you specified to export to
(C:/pC/git/quarry/rock-workshop-3hr/results/mergedSourceDf.xlsx) already
exists, and `preventOverwriting` is set to `TRUE`, so I'm not writing to disk.
To override this, pass `preventOverwriting=FALSE`.
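The warning above names the `preventOverwriting` argument as one way around this. The other route, deleting the old Excel file before exporting again, can be scripted in base R. The sketch below is a minimal illustration that works in a temporary directory so that it does not touch your real results; the name `exportFile` is an assumption, and in the workshop project you would point it at the file in `resultsPath` instead.

```r
### Stand-in location for the export (assumption: a temporary directory
### is used here so that no real results are deleted)
exportFile <- file.path(tempdir(), "mergedSourceDf.xlsx");

### Simulate an export left over from an earlier run
file.create(exportFile);

### Remove the old file, if present, before exporting again
if (file.exists(exportFile)) {
  file.remove(exportFile);
}

### TRUE when nothing is in the way of a fresh export
readyForExport <- !file.exists(exportFile);
print(readyForExport);
```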
If multiple coders are applying different codes or coding schemes to the same dataset, or if a single coder is applying different codes in different rounds of coding, then merging coded sources may be useful. Merging means that you combine different coded versions of the same source into a “master” source that contains all applied codes. Merging is made possible via unique utterance identifiers (uids).
Some pre-coded versions of the data have been added to the subdirectory “041---coded-sources-for-merging”. A good practice is to create a “slug” for each coded version of the sources, for example, “_coder1” and “_coder2”, which you will see for the mock data. You need to choose one version of the coded source to be the foundation onto which the other versions are merged (indicated by “primarySourcesRegex” in the code below). For example, the command below says that all versions of each source should be “collapsed” onto the version with the slug “_coder1”. The command will write the merged sources into the same directory as where it found them, resulting in a merged version for each source that you placed into that directory.
rock::merge_sources(
  input = here::here(
    "data",
    "041---coded-sources-for-merging"
  ),
  output = "same",
  primarySourcesPath = here::here(
    "data",
    "041---coded-sources-for-merging"
  ),
  primarySourcesRegex = "_coder1\\.rock"
);
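To make the role of the uids in merging more concrete, here is a simplified base R sketch that collapses a second coder's codes onto a primary version by matching lines on their uid. Everything in it is an illustrative assumption: the `[[uid=...]]` and `[[CodeA]]` notations, the example lines, and the helper functions `extractUid()` and `extractCodes()` are made up for this sketch and are not part of the {rock} package, whose `merge_sources()` function handles many more cases.

```r
### Two made-up coded versions of the same source (assumption)
coder1 <- c("[[uid=a1]] I felt tired. [[CodeA]]",
            "[[uid=a2]] Then I rested.");
coder2 <- c("[[uid=a1]] I felt tired. [[CodeB]]",
            "[[uid=a2]] Then I rested. [[CodeC]]");

### Hypothetical helper: get the uid of each line
extractUid <- function(lines) {
  m <- regmatches(lines, regexec("\\[\\[uid=([a-zA-Z0-9]+)\\]\\]", lines));
  vapply(m, function(x) x[2], character(1));
}

### Hypothetical helper: get all codes applied to each line
extractCodes <- function(lines) {
  regmatches(lines, gregexpr("\\[\\[Code[A-Za-z0-9_]+\\]\\]", lines));
}

### Index the secondary coder's codes by uid
secondaryCodes <- extractCodes(coder2);
names(secondaryCodes) <- extractUid(coder2);

primaryUids <- extractUid(coder1);
primaryCodes <- extractCodes(coder1);

### Append to each primary line the codes it does not have yet
merged <- vapply(seq_along(coder1), function(i) {
  extra <- setdiff(secondaryCodes[[primaryUids[i]]], primaryCodes[[i]]);
  paste(c(coder1[i], extra), collapse = " ");
}, character(1));

cat(merged, sep = "\n");
```

Matching on the uid rather than on the line number is what keeps the merge reliable when the coded versions differ from each other.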
This R Markdown file (see R Markdown section below) can be adapted by workshop participants to their own needs. The authors have deposited this file in the public domain: we waive all (copy)rights.
This file is available in a public Codeberg repository at https://codeberg.org/quarry/rock-workshop-3hr.git and can be downloaded from there. A rendered version of this file is available at https://quarry.opens.science/rock-workshop-3hr.
The slides are available here. More background information is available in the ROCK book at https://rockbook.org.
To command the rock package (or use other R functionality), you usually use functions. Functions are small programs that do things for you. For them to know what to do, you have to pass so-called arguments or parameters when you call them. If you get everything right, the function will do its job and return its result to you. You will usually want to store that result, so you can do other things with it.
To illustrate this, let us create a simple source using a function. The following command creates a character vector with two elements:
firstSourceBit <-
  c("this is the first element",
    "this is the second element");
We have now called a function called c() to combine two elements into a vector (a list of elements). We passed two arguments to this function (the two text strings), and the function returned the result to us (the vector), which we stored in a variable called firstSourceBit with the assignment operator, <-.
If you are viewing the source code of this R Markdown file in RStudio (either Desktop, on your PC, or Posit Cloud, in a web browser), you can select the three lines with R commands above and copy-paste them into the console in the bottom-left corner to try it out. If instead you are reading the rendered version of this R Markdown file, why not use the link above to open the associated project in Posit Cloud so you can play along?
We can check that this worked by telling R to display the contents of the firstSourceBit object, simply by specifying its name in the console:
firstSourceBit
R then shows its contents, and should show:
[1] "this is the first element" "this is the second element"
We can now use another function to combine these two elements into a single character value again, using a so-called ‘newline character’, \n, as separator. The R function to paste several character strings together is called paste(). You usually call it the same way we called the c() function above, by specifying all character strings as separate arguments, but we can also pass them all in a vector: then we pass another argument called collapse to tell paste() the separator it should use when collapsing the vector into a single string:
firstSourceBit_collapsed <-
  paste(
    firstSourceBit,
    collapse = "\n"
  );
If you let R print the contents of this new object firstSourceBit_collapsed, you see:
[1] "this is the first element\nthis is the second element"
Here, R shows the newline character as an escape sequence (\n), instead of as the newline it represents. To force R to display the newline character as a new line, use the cat() function:
cat(firstSourceBit_collapsed);
Which should show:
this is the first element
this is the second element
You have now successfully used your first three functions. Below, as we use the rock package, we will use more functions, usually with more arguments.
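The text above mentions that paste() is usually called with all character strings as separate arguments. As a small illustration, the sketch below compares both calling styles; the object names `pastedSeparately` and `pastedFromVector` are made up for this example. Note that with separate arguments, the separator is passed as `sep`, whereas with a single vector it is passed as `collapse`.

```r
### Separate arguments: the separator goes into `sep`
pastedSeparately <-
  paste(
    "this is the first element",
    "this is the second element",
    sep = "\n"
  );

### One vector: the separator goes into `collapse`
pastedFromVector <-
  paste(
    c("this is the first element",
      "this is the second element"),
    collapse = "\n"
  );

### Both calls produce the same single character value
identical(pastedSeparately, pastedFromVector);
```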
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button, a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. R chunks always start with a line containing three backticks (`) and two curly braces ({ and }), with the chunk’s language (usually r), an optional chunk label, and the chunk options in between the curly braces. When knitting the R Markdown document, the R chunks are executed and the results are inserted into the final rendered HTML (or PDF, or Word) file.
If you want to use the rock R package on your own computer, you will first have to download and install it in R. The following R chunk contains commands you can use to install the rock package.
### Note the `eval=FALSE` chunk option on the line above; this tells the `knitr`
### package to *not* execute the R code in this chunk. This has been added
### because you normally will not want to reinstall that package *every time*
### you run this script.
### To install the version on R's CRAN repository network, use:
install.packages('rock');
### The next two commands require the `remotes` package to be installed;
### if you don't have that yet, you can install it with:
install.packages('remotes');
### To install the latest version of the package (at your own risk), use:
remotes::install_gitlab("r-packages/rock");
### To install the cutting edge version (at even more of your own risk), use:
remotes::install_gitlab("r-packages/rock@dev");
### Note that because the `install_gitlab()` function comes from a package,
### we tell R from which package to get it using the `::` operator.
For more on ROCK terminology, see: https://sci-ops.gitlab.io/rockbook/vocab.html.
The Reproducible Open Coding Kit (ROCK) standard is licensed under CC0 1.0 Universal. The {rock} R package is licensed under a GNU General Public License; for more see: https://rock.science.
ROCK citation: Gjalt-Jorn Ygram Peters and Szilvia Zörgő (2023). rock: Reproducible Open Coding Kit. R package version 0.7.1. https://rock.opens.science
For more on ROCK materials licensing and citation, please see: https://rock.opens.science/authors.html#citation.
Thank you for considering using ‘rock’ for your qualitative project. If you have any questions or would like to make suggestions on how to improve ‘rock’, feel free to write to: info@rock.science.