Last updated: 2019-09-19


Knit directory: polymeRID/

This reproducible R Markdown analysis was created with workflowr (version 1.4.0.9001).


The command set.seed(20190729) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

These are the previous versions of the R Markdown and HTML files.

File  Version  Author     Date        Message
html  75bc270  goergen95  2019-09-05  Build site.
Rmd   a848def  goergen95  2019-09-05  changed citation style
html  c26a428  goergen95  2019-08-22  Build site.
Rmd   7e9eddd  goergen95  2019-08-22  wflow_publish(files = c("analysis/cnn_crossvalidation.Rmd", "analysis/cnn_exploration.Rmd",
html  f2ee83c  goergen95  2019-08-19  Build site.
Rmd   b6fd75e  goergen95  2019-08-19  added classification side
html  b6fd75e  goergen95  2019-08-19  added classification side

Overview

The overall aim of this project was to ease the classification of spectra of particles found in environmental samples. A script was created which takes the .txt files in the smp directory and classifies them using the decision fusion explained elsewhere on this site. The script expects one file per sample. File names need to be unique because they are used as identifiers in the output. Each file is expected to consist of two columns: the first holds the wavenumbers of the sample as numeric values, the second the corresponding reflectance values.
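As an illustration of the expected input, the following sketch writes a made-up two-column spectrum to a temporary directory and reads it back the way a sample file would be read. The file name, the wavenumber grid and the reflectance values are assumptions for demonstration only and do not come from the project data.

```r
# Hypothetical spectrum: wavenumbers (1/cm) in column 1, reflectance in column 2.
wavenumbers <- seq(4000, 600, by = -200)
spectrum <- data.frame(
  wavenumbers = wavenumbers,
  reflectance = seq(0.1, 0.9, length.out = length(wavenumbers))
)

# Write the file without header or row names, i.e. two plain numeric columns.
sample_file <- file.path(tempdir(), "example_sample_1.txt")
write.table(spectrum, sample_file, row.names = FALSE, col.names = FALSE)

# Read it back and check the structure the script relies on.
check <- read.table(sample_file)
names(check) <- c("wavenumbers", "reflectance")
stopifnot(ncol(check) == 2,
          is.numeric(check$wavenumbers),
          is.numeric(check$reflectance))

# The file name without its extension serves as the sample identifier.
sample_id <- sub("\\.txt$", "", basename(sample_file))
```

Because the identifier is derived from the file name, two samples with the same name would be indistinguishable in the output.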

Preparing Classification

The head of the classification script contains some variables that can be changed. This matters if the database has been extended or a new calibration has been carried out. In that case, the directory containing the models can be specified with the MODEL variable. For now, the variables point to the BASE models which were created during the first calibration. The type of classification is specified in the TYPE variable. Currently, the fusion of all four models is chosen, but any single model can be selected instead. Finally, the FORMAT variable indicates the file extension of the sample data, since spectral data can also be provided in .csv format. At the beginning of each classification run, a directory is created in the smp directory, named with the current timestamp and the TYPE of classification that is going to be used.

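# Settings that can be adapted by the user
MODEL = "BASE"      # sub-directory of mod that holds the calibrated models
TYPE = "FUSION"     # fusion of all four models, or a single model
FORMAT = ".txt"     # file extension of the sample files

# One output directory per run, named <timestamp>_<TYPE>.
# smp and mod are path variables defined outside this snippet.
TIME = format(Sys.time(),"%Y%m%d_%H%M")
root = paste0(smp,TIME,"_",TYPE)
plots = paste0(root,"/plots")
raw = paste0(root,"/files")
dir.create(root)
dir.create(plots)
dir.create(raw)
model = paste0(mod,MODEL)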
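The directory layout created at the start of a run can be tried out in isolation. The sketch below is a minimal stand-alone variant that works in a temporary directory; smp_dir is a hypothetical stand-in for the smp path, and file.path() is used instead of paste0() so that no trailing slash is required.

```r
# Hypothetical stand-in for the smp directory of the project.
smp_dir <- file.path(tempdir(), "smp")
type <- "FUSION"

# Timestamp with minute resolution, as used for the run directories.
stamp <- format(Sys.time(), "%Y%m%d_%H%M")
run_dir <- file.path(smp_dir, paste0(stamp, "_", type))

# Create the run directory together with its plots and files sub-directories.
for (d in c(run_dir, file.path(run_dir, "plots"), file.path(run_dir, "files"))) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}
```

Since the timestamp only resolves minutes, two runs of the same type started within the same minute would share one directory.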

Reading Data

The database and the sample data are read into R. The reflectance values of the sample data are resampled to the spectral resolution of the database, and a baseline correction is then applied following the procedure of Primpke et al. (2018). Finally, all machine-learning models are loaded from the mod directory.

# Read the reference database: one CSV file per class, stacked row-wise.
# ref is the path to the reference directory (defined outside this snippet).
classes = readLines(paste0(ref,"classes.txt"))
data = lapply(classes,function(x){
  print(x)
  specs = read.csv(list.files(ref,full.names=TRUE)[grep(paste0("_",x,".csv"),list.files(ref))],header=TRUE)
  return(specs)
})
data = do.call("rbind",data)
wavenumbers = readRDS(paste0(model,"/wavenumbers.rds"))

# Keep only the wavenumbers used during calibration plus the last (non-spectral) column.
wvn = as.numeric(str_remove(names(data)[-ncol(data)],"wvn"))
index = which(wvn %in% wavenumbers)
data = data[,c(index,ncol(data))]

# Collect the sample files from the smp directory.
sampleList = list.files(smp,pattern=FORMAT,full.names = TRUE)
if (length(sampleList)==0){
  cat("No samples present in sample directory")
  #quit(status = 1)
}
Nsamples = length(sampleList)

# Read one sample file and resample it to the wavenumbers of the database.
prepSMP = function(x,wvn){
  tmp = read.table(x)
  names(tmp) = c("wavenumbers","reflectance")
  tmp = prospectr::resample2(tmp$reflectance,tmp$wavenumbers,wvn)
  return(tmp)
}

samples = lapply(sampleList,prepSMP,wavenumbers)
samples = as.data.frame(do.call("rbind",samples))
names(samples) = names(data)[-ncol(data)]

# Baseline correction of the resampled spectra.
dummy = as.matrix(samples)
baselineDummy = baseline(dummy,method="rfbaseline",span=NULL,NoXP=64,maxit=c(10))
#baselineDummy = baseline(dummy, method="rollingBall", wm = 500, ws = 500 )
spectra = getCorrected(baselineDummy)
samples = as.data.frame(spectra)

# Load the random forest models, the PCA models and the two CNNs.
files = list.files(model, full.names = TRUE)
rfModRaw = readRDS(files[grep("rfModRaw.rds", files)])
rfModSG = readRDS(files[grep("rfModSG.rds", files)])
pcaRaw = readRDS(files[grep("rfModRawPCA.rds", files)])
pcaSG = readRDS(files[grep("rfModSGPCA.rds", files)])
cnnD2 = keras::load_model_hdf5(files[grep("cnnD2",files)])
cnnND2 = keras::load_model_hdf5(files[grep("cnnND2", files)])
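The idea of bringing a sample spectrum onto the wavenumber grid of the database can be illustrated without the project data. The sketch below swaps in base R approx(), which performs plain linear interpolation; this is not the method applied by prospectr::resample2 in the script, so the numbers will generally differ. All values are made up for the example.

```r
# Hypothetical raw spectrum measured on a coarse, descending wavenumber grid.
raw_wvn  <- c(4000, 3500, 3000, 2500, 2000, 1500, 1000, 600)
raw_refl <- c(0.10, 0.20, 0.40, 0.30, 0.50, 0.70, 0.60, 0.20)

# Hypothetical target grid standing in for the wavenumbers of the database.
target_wvn <- seq(3800, 800, by = -500)

# Linear interpolation of the reflectance onto the target grid.
resampled <- approx(x = raw_wvn, y = raw_refl, xout = target_wvn)$y

# One value per target wavenumber, in the order of target_wvn.
resampled_spectrum <- data.frame(wavenumbers = target_wvn, reflectance = resampled)
```

After this step every sample has the same number of values in the same order, which is what allows the spectra to be stacked into one data frame and passed to the models.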

Classification

For the decision fusion, the samples are pre-processed according to the input expected by the different models (described elsewhere on this site), and the values in the CO2 window (2200 to 2420 1/cm) are set to 0. Each model is then used to predict an output for the samples. The decision fusion takes place by averaging the probability outputs of the four models. In addition, all non-synthetic polymer classes are merged into a broader class named OTHER.
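Masking the CO2 window amounts to a logical index on the wavenumber vector. The following is a minimal sketch on a made-up matrix of spectra (one row per sample, one column per wavenumber); the grid and the values are assumptions for illustration only.

```r
# Hypothetical wavenumber grid and two flat dummy spectra.
wavenumbers <- seq(4000, 600, by = -100)
spectra <- matrix(1, nrow = 2, ncol = length(wavenumbers))

# Columns that fall inside the CO2 window (2200 to 2420 1/cm).
co2 <- wavenumbers >= 2200 & wavenumbers <= 2420

# Set the values inside the window to 0 for all samples.
spectra[, co2] <- 0
```

Values outside the window are left untouched.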

# Predicting classes and class probabilities with each model.
# pcaRAW, x_sampleD2 and x_sampleND2 are not defined in this excerpt;
# they come from the pre-processing step mentioned above.
classRFRaw = as.character(stats::predict(rfModRaw, pcaRAW))
propRFRaw =  stats::predict(rfModRaw, pcaRAW, type = "prob")
classRFSG = as.character(stats::predict(rfModSG, pcaSG))
propRFSG = stats::predict(rfModSG, pcaSG, type = "prob")
classCNND2 = as.character(classes[keras::predict_classes(cnnD2, x_sampleD2)+1])
propCNND2 = keras::predict_proba(cnnD2, x_sampleD2)
classCNNND2 = as.character(classes[keras::predict_classes(cnnND2, x_sampleND2)+1])
propCNNND2 = keras::predict_proba(cnnND2, x_sampleND2)

# Decision fusion: average the class probabilities of the four models.
probs = (propRFRaw + propRFSG + propCNND2 + propCNNND2) / 4
# Index of the most probable class per sample
pred = lapply(1:nrow(probs), function(x){
  which.max(probs[x,])
})
# Fused probability of that class
predVals = lapply(1:nrow(probs), function(x){
  probs[x,unlist(pred)[x]]
})
# The three most probable classes per sample
hits = lapply(1:nrow(probs), function(x){
  sort(probs[x, ], decreasing = TRUE)[1:3]
})

predVals = unlist(predVals)
pred = names(unlist(pred))
# Merge the non-synthetic classes into OTHER
pred[which(pred %in% c("FIBRE","FUR","WOOD"))] = "OTHER"
# ids holds the sample names (defined in the output section)
results = data.frame(id = ids, class = pred, prob = predVals, level = rep(0,Nsamples))
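The fusion step can be followed on a toy example. The sketch below averages four made-up probability matrices (two samples, four classes), picks the most probable class per sample and merges a non-synthetic class into OTHER. The class set and all probabilities are invented for illustration; the helper make_probs is hypothetical and not part of the project code.

```r
classes <- c("HDPE", "PES", "PS", "FUR")

# Hypothetical helper: build a 2 x 4 probability matrix with class names as columns.
make_probs <- function(...) {
  matrix(c(...), nrow = 2, byrow = TRUE, dimnames = list(NULL, classes))
}

# Made-up outputs of four models for the same two samples.
p1 <- make_probs(0.7, 0.1, 0.1, 0.1,   0.1, 0.2, 0.1, 0.6)
p2 <- make_probs(0.6, 0.2, 0.1, 0.1,   0.2, 0.2, 0.2, 0.4)
p3 <- make_probs(0.9, 0.0, 0.1, 0.0,   0.1, 0.4, 0.1, 0.4)
p4 <- make_probs(0.8, 0.1, 0.1, 0.0,   0.0, 0.3, 0.1, 0.6)

# Decision fusion: unweighted mean of the class probabilities.
fused <- (p1 + p2 + p3 + p4) / 4

# Most probable class and its fused probability for each sample.
best <- max.col(fused, ties.method = "first")
pred <- colnames(fused)[best]
prob <- fused[cbind(seq_len(nrow(fused)), best)]

# The three most probable classes per sample, as used for the plots.
top3 <- lapply(seq_len(nrow(fused)), function(i) {
  sort(fused[i, ], decreasing = TRUE)[1:3]
})

# Merge non-synthetic classes into OTHER after the winner has been picked.
pred[pred %in% c("FIBRE", "FUR", "WOOD")] <- "OTHER"
```

Note that the merge only relabels the winning class. The probabilities of the merged classes are not summed, so a sample whose probability mass is spread over several non-synthetic classes can still be assigned to a synthetic polymer.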

Generating Output

Finally, several outputs are created for the user to assess the classification results. For each sample, an individual plot is created showing the three classes with the highest probability. Furthermore, a data frame named results, which contains the level of agreement for the class with the highest probability, is written to disk to allow a quick assessment of the classification. The level of agreement is based on the fused classification probability: probabilities below 0.5 are labeled “no agreement”, and the level increases in steps of 0.1 up to probabilities of 0.9 and above, which are labeled “very high agreement”. The plots are saved to the plots directory inside the current classification directory and can be used to assess the classification manually.

# Sample identifiers: file names without the extension
ids = list.files(smp,pattern = FORMAT)
ids = str_remove(ids, FORMAT)

for (id in seq_along(ids)){
  # Names and fused probabilities of the three most probable classes
  hit = hits[[id]]
  classes = names(hit)
  values = as.numeric(hit)
  sample = as.data.frame(t(samples[id,]))
  sample$wavenumbers = wavenumbers
  # Note: this sets both columns (reflectance and wavenumbers) to 0 inside the CO2 window
  sample[which(wavenumbers<=2420 & wavenumbers>=2200),] = 0
  names(sample) = c( "reflectance", "wavenumbers")
  # Translate the highest fused probability into a level of agreement
  if(values[1] < .5) level = "no agreement"
  if(values[1] >= .5 & values[1] < .6) level = "very low agreement"
  if(values[1] >= .6 & values[1] < .7) level = "low agreement"
  if(values[1] >= .7 & values[1] < .8) level = "medium agreement"
  if(values[1] >= .8 & values[1] < .9) level = "high agreement"
  if(values[1] >= .9) level = "very high agreement"
  results$level[id] = level

  # One plot per sample with a panel for each of the three classes
  annotation = paste0(level,"\n",
                      classes[1], ": ", round(values[1], 3), "\n",
                      classes[2], ": ", round(values[2], 3), "\n",
                      classes[3], ": ", round(values[3], 3))
  class1 = samplePlot(data = data, sample = sample, class = classes[1], prob = annotation, name = ids[id])
  class2 = samplePlot(data = data, sample = sample, class = classes[2])
  class3 = samplePlot(data = data, sample = sample, class = classes[3])
  multiclass = gridExtra::grid.arrange(class1,class2,class3)
  ggsave(plot=multiclass,filename=paste0(plots,"/",ids[id],"_probClasses.png"),dpi=300,device="png",units="cm",width=50,height=30)
}
write.csv(results, paste0(root, "/results_",TIME,".csv"))
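The mapping from fused probability to agreement level can also be written in vectorized form. The following sketch uses cut() with left-closed intervals, which reproduces the thresholds described above; the function name agreement_level is hypothetical and the input values are taken from or close to those in the results table.

```r
# Hypothetical helper: translate fused probabilities into agreement levels.
agreement_level <- function(prob) {
  cut(prob,
      breaks = c(-Inf, 0.5, 0.6, 0.7, 0.8, 0.9, Inf),
      labels = c("no agreement", "very low agreement", "low agreement",
                 "medium agreement", "high agreement", "very high agreement"),
      right = FALSE)  # intervals are closed on the left: [0.5, 0.6), ...
}

levels_out <- as.character(
  agreement_level(c(0.499, 0.5052317, 0.6929572, 0.8341086, 0.9))
)
```

With right = FALSE a probability of exactly 0.5 falls into “very low agreement” and exactly 0.9 into “very high agreement”, matching the >= comparisons in the loop.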

Results

Here, we evaluate the decision fusion using environmental samples which were provided by Sarah Brüning and Frauke von den Driesch. The overall classification results, as they are written to the results CSV file, are presented below.

Tab. 1: Classification results for 14 sample spectra.
X   id                class  prob       level
1   120619_W2_1000_1  OTHER  0.3020575  no agreement
2   120619_W2_1000_2  PES    0.3153827  no agreement
3   120619_W2_300_1   PES    0.2533208  no agreement
4   120619_W2_300_2   PUR    0.2301582  no agreement
5   120619_W2_300_3   HDPE   0.8341086  high agreement
6   120619_W2_300_4   HDPE   0.5052317  very low agreement
7   120619_W2_300_5   HDPE   0.6929572  low agreement
8   120619_W2_500_1   PS     0.3061395  no agreement
9   120619_W2_500_2   PES    0.2373230  no agreement
10  120619_W2_500_3   PES    0.2901537  no agreement
11  120619_W2_500_4   HDPE   0.3477586  no agreement
12  120619_W2_500_5   PES    0.2785373  no agreement
13  120619_W2_500_6   PA     0.4990000  no agreement
14  120619_W2_500_7   PES    0.2935073  no agreement

Additionally, two exemplary plots are shown that overlay the spectra of two samples with the mean spectra of the classified polymer in the database. Note that the first plot corresponds to row 5 in the table above, the second plot to row 13.

Primpke, S., Wirth, M., Lorenz, C., Gerdts, G., 2018. Reference database design for the automated analysis of microplastic samples based on Fourier transform infrared (FTIR) spectroscopy. Analytical and Bioanalytical Chemistry 410, 5131–5141. https://doi.org/10.1007/s00216-018-1156-x


sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 19.1

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] plotly_4.9.0              tensorflow_1.14.0        
 [3] abind_1.4-5               e1071_1.7-2              
 [5] keras_2.2.4.1             workflowr_1.4.0.9001     
 [7] baseline_1.2-1            gridExtra_2.3            
 [9] stringr_1.4.0             prospectr_0.1.3          
[11] RcppArmadillo_0.9.600.4.0 openxlsx_4.1.0.1         
[13] magrittr_1.5              ggplot2_3.2.0            
[15] reshape2_1.4.3            dplyr_0.8.3              

loaded via a namespace (and not attached):
 [1] httr_1.4.1         tidyr_0.8.3        jsonlite_1.6      
 [4] viridisLite_0.3.0  foreach_1.4.7      shiny_1.3.2       
 [7] assertthat_0.2.1   highr_0.8          yaml_2.2.0        
[10] pillar_1.4.2       backports_1.1.4    lattice_0.20-38   
[13] glue_1.3.1         reticulate_1.13    digest_0.6.20     
[16] promises_1.0.1     colorspace_1.4-1   htmltools_0.3.6   
[19] httpuv_1.5.1       Matrix_1.2-17      plyr_1.8.4        
[22] pkgconfig_2.0.2    SparseM_1.77       xtable_1.8-4      
[25] purrr_0.3.2        scales_1.0.0       whisker_0.3-2     
[28] later_0.8.0        git2r_0.26.1       tibble_2.1.3      
[31] generics_0.0.2     withr_2.1.2        lazyeval_0.2.2    
[34] mime_0.7           crayon_1.3.4       IDPmisc_1.1.19    
[37] evaluate_0.14      fs_1.3.1           class_7.3-15      
[40] RcppZiggurat_0.1.5 tools_3.6.1        data.table_1.12.2 
[43] munsell_0.5.0      zip_2.0.3          Rfast_1.9.5       
[46] compiler_3.6.1     rlang_0.4.0        grid_3.6.1        
[49] iterators_1.0.12   Rmisc_1.5          htmlwidgets_1.3   
[52] crosstalk_1.0.0    base64enc_0.1-3    labeling_0.3      
[55] rmarkdown_1.14     gtable_0.3.0       codetools_0.2-16  
[58] R6_2.4.0           tfruns_1.4         knitr_1.24        
[61] zeallot_0.1.0      rprojroot_1.3-2    stringi_1.4.3     
[64] parallel_3.6.1     Rcpp_1.0.2         tidyselect_0.2.5  
[67] xfun_0.8