My name is Valerio Giuffrida and I am a Lecturer in Data Science at Edinburgh Napier University. I obtained my Ph.D. from the IMT School for Advanced Studies Lucca (supervisor: Prof Sotirios A Tsaftaris), based at the University of Edinburgh. I have published several papers on machine learning and plant phenotyping. Specifically, my first work presented a learning algorithm to count leaves in rosette plants. I then developed my research interests in neural networks, particularly Restricted Boltzmann Machines (RBMs). In 2016, I participated in the enrichment programme of The Alan Turing Institute. Besides my scientific skills, I am an experienced programmer with knowledge of several programming languages. In fact, I am the lead developer of the Phenotiki Analysis Software, which bundles computer vision and machine learning algorithms to analyse rosette plants. During my master's degree, I had the opportunity to attend several scientific summer schools, such as the International Computer Vision Summer School (ICVSS) and the Medical Imaging Summer School (MISS), which were a great motivation for my scientific career.
My new journal paper "Unsupervised Rotation Factorization in Restricted Boltzmann Machines" is now available at https://ieeexplore.ieee.org/document/8870198. The paper presents a new method to learn rotation-invariant features in shallow neural networks without the need for data augmentation, and provides mathematical and experimental evidence of its effectiveness.
I have now moved from the University of Edinburgh to Edinburgh Napier University as a Lecturer in Data Science.
From 18 January to 2 February, I spent two weeks in Debre Zeyt, Ethiopia, as part of my work on the BBSRC GCF project I am currently involved in (further details at http://chickpearoots.org). We provided in-situ support to build a rhizobox imaging station, replicating the setup we had built in Edinburgh. It was an incredible experience, both professionally and personally. The take-away thought: international cooperation is important in collaborative research.
The 4th CVPPP workshop will be held in conjunction with BMVC 2018, Newcastle (UK) on the 6th September 2018. More info here!
Finding suitable image representations for the task at hand is critical in computer vision. Different approaches extending the original Restricted Boltzmann Machine (RBM) model have recently been proposed to offer rotation-invariant feature learning. In this paper, we present a novel extended RBM that learns rotation-invariant features by explicitly factorizing out the rotation nuisance in 2D image inputs within an unsupervised framework. While the goal is to learn invariant features, our model infers an orientation per input image during training, using information related to the reconstruction error. The training process is regularised by a Kullback-Leibler divergence, offering stability and consistency. We used the γ-score, a measure that calculates the amount of invariance, to mathematically and experimentally demonstrate that our approach indeed learns rotation-invariant features. We show that our method outperforms the current state-of-the-art RBM approaches for rotation-invariant feature learning on three different benchmark datasets, by measuring the performance with the test accuracy of an SVM classifier. Our implementation is available at https://bitbucket.org/tuttoweb/rotinvrbm.
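To make the orientation-inference idea concrete, here is a minimal toy sketch (not the paper's implementation; the Bernoulli RBM, the 15° angle grid, and all names are illustrative assumptions): for each image, the rotation whose RBM reconstruction error is lowest is taken as the inferred orientation.

```python
# Illustrative sketch: infer a per-image orientation by choosing the rotation
# whose RBM reconstruction error is smallest. Weights here are random toys.
import numpy as np
from scipy.ndimage import rotate

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def reconstruction_error(W, b, c, v):
    """One mean-field up-down pass of a Bernoulli RBM; returns the squared error."""
    h = sigmoid(v @ W + c)            # hidden activations
    v_rec = sigmoid(h @ W.T + b)      # reconstruction of the visible layer
    return np.sum((v - v_rec) ** 2)

def infer_orientation(W, b, c, image, angles=tuple(range(0, 360, 15))):
    """Return the angle (in degrees) that minimises the reconstruction error."""
    errors = [reconstruction_error(W, b, c,
                                   rotate(image, a, reshape=False, order=1).ravel())
              for a in angles]
    return angles[int(np.argmin(errors))]

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.01, (784, 64))        # 28x28 visible units, 64 hidden units
b, c = np.zeros(784), np.zeros(64)
image = rng.random((28, 28))
print(infer_orientation(W, b, c, image))
```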
Deep learning is making strides in plant phenotyping and agriculture. But pretrained models require significant adaptation to work on new target datasets originating from a different experiment even on the same species. The current solution is to retrain the model on the new target data implying the need for annotated and labelled images. This paper addresses the problem of adapting a previously trained model on new target but unlabelled images. Our method falls in the broad machine learning problem of domain adaptation, where our aim is to reduce the difference between the source and target dataset (domains). Most classical approaches necessitate that both source and target data are simultaneously available to solve the problem. In agriculture it is possible that source data cannot be shared. Hence, we propose to update the model without necessarily sharing the data of the training source to preserve confidentiality. Our major contribution is a model that reduces the domain shift using an unsupervised adversarial adaptation mechanism on statistics of the training (source) data. In addition, we propose a multi-output training process that (i) allows (quasi-)integer leaf counting predictions; and (ii) improves the accuracy on the target domain, by minimising the distance between the counting distributions on the source and target domain. In our experiments we used a reduced version of the CVPPP dataset as source domain. We performed two sets of experiments, showing domain adaptation in the intra- and inter-species case. Using an Arabidopsis dataset as target domain, the prediction results exhibit a mean squared error (MSE) of 2.3. When a different plant species was used (Komatsuna), the MSE was 1.8.
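A hedged sketch of the general idea of adapting without access to source images (this is an illustration, not the authors' architecture; all module sizes, the Gaussian assumption on source features, and variable names are assumptions): only summary statistics of source features are stored, samples drawn from them stand in for the source domain, and an adversarial game pushes target features towards those statistics.

```python
# Illustrative sketch: adapt a feature extractor to unlabelled target data
# using only stored statistics (mean/std) of source features, adversarially.
import torch
import torch.nn as nn

feat_dim = 128
source_mean = torch.zeros(feat_dim)     # assumed saved during source training
source_std = torch.ones(feat_dim)

feature_extractor = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat_dim),
)
discriminator = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1))

opt_f = torch.optim.Adam(feature_extractor.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

target_images = torch.rand(8, 3, 64, 64)    # unlabelled target batch (stand-in)

for step in range(5):
    # 1) Discriminator: pseudo-source samples (from stored stats) vs target features.
    pseudo_source = source_mean + source_std * torch.randn(8, feat_dim)
    target_feat = feature_extractor(target_images)
    d_loss = bce(discriminator(pseudo_source), torch.ones(8, 1)) + \
             bce(discriminator(target_feat.detach()), torch.zeros(8, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Feature extractor: make target features look "source-like".
    f_loss = bce(discriminator(feature_extractor(target_images)), torch.ones(8, 1))
    opt_f.zero_grad(); f_loss.backward(); opt_f.step()
```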
Root imaging of a growing plant in a non-invasive, affordable, and effective way remains challenging. One approach is to image roots by growing them in a rhizobox, a soil-filled transparent container, imaging them with digital cameras, and segmenting root from soil background. However, due to soil occlusion and the fact that digital imaging is a 2D projection of a 3D object, gaps are present in the segmentation masks, which may hinder the extraction of fine-grained root system architecture (RSA) traits. Herein, we develop an image inpainting technique to recover gaps from disconnected root segments. We train a patch-based deep fully convolutional network using a supervised loss, but also use adversarial mechanisms at the patch and whole-root level. We use a policy gradient method to endow the model with a large-scale, whole-root view during training. We train our model using synthetic root data.
In our experiments, we show that by using adversarial mechanisms at the local and whole-root level, we obtain a 72% improvement in performance in recovering gaps in real chickpea data when using only patch-level supervision.
Deep learning methods are constantly increasing in popularity and success across a wide range of computer vision applications. However, they are perceived as `black boxes', due to the lack of an intuitive interpretation of their decision processes. We present a study aimed at understanding how Deep Neural Networks (DNN) reach a decision in regression tasks. This study focuses on deep learning approaches in the common plant phenotyping task of leaf counting. We employ Layerwise Relevance Propagation (LRP) and Guided Back Propagation to provide insight into which parts of the input contribute to intermediate layers and the output. We observe that the network largely disregards the background and focuses on the plant during training. More importantly, we found that the leaf blade edges are the most relevant part of the plant for the network model in the counting task. Results are evaluated using a VGG-16 deep neural network on the CVPPP 2017 Leaf Counting Challenge dataset.
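A minimal sketch of Guided Backpropagation on a VGG-16 regressor (hedged: a generic illustration, not the study's exact setup; the single-output head below is a placeholder): negative gradients are zeroed at every ReLU during the backward pass, so the resulting saliency map highlights input pixels that positively support the predicted count.

```python
# Illustrative sketch: Guided Backpropagation saliency for a VGG-16 regressor.
import torch
import torch.nn as nn
from torchvision.models import vgg16

model = vgg16(weights=None)
model.classifier[-1] = nn.Linear(4096, 1)    # placeholder single-output count head
model.eval()

def guide(module, grad_input, grad_output):
    # Zero out negative gradients flowing back through every ReLU.
    return (torch.clamp(grad_input[0], min=0.0),)

for m in model.modules():
    if isinstance(m, nn.ReLU):
        m.inplace = False                    # full backward hooks need out-of-place ops
        m.register_full_backward_hook(guide)

image = torch.rand(1, 3, 224, 224, requires_grad=True)
model(image).sum().backward()

saliency = image.grad.abs().max(dim=1).values   # per-pixel relevance map
print(saliency.shape)                           # torch.Size([1, 224, 224])
```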
The analysis of root system growth, root phenotyping, is important to inform efforts to enhance plant resource acquisition from soils. However, root phenotyping remains challenging due to soil opacity and requires systems that optimize root visibility and image acquisition. Previously reported systems require costly and bespoke materials not available in most countries, where breeders need tools to select varieties best adapted to local soils and field conditions. Here, we present an affordable soil-based growth container (rhizobox) and imaging system to phenotype root development in greenhouses or shelters. All components of the system are made from commodity components, locally available worldwide to facilitate the adoption of this affordable technology in low-income countries. The rhizobox is large enough (~6000 cm2 visible soil) to not restrict vertical root system growth for at least seven weeks after sowing, yet light enough (~21 kg) to be routinely moved manually. Support structures and an imaging station, with five cameras covering the whole soil surface, complement the rhizoboxes. Images are acquired via the Phenotiki sensor interface, collected, stitched and analysed. Root system architecture (RSA) parameters are quantified without intervention. RSA of a dicot (chickpea, Cicer arietinum L.) and a monocot (barley, Hordeum vulgare L.) species, which exhibit contrasting root systems, were analysed. The affordable system is relevant for efforts in Ethiopia and elsewhere to enhance yields and climate resilience of chickpea and other crops for improved food security.
Plant phenotyping refers to the measurement of plant visual traits. In the past, the collection of such traits has been done manually by plant scientists, which is a tedious, error-prone, and time-consuming task. For this reason, image-based plant phenotyping is used to facilitate the measurement of plant traits with algorithms. However, the lack of robust software to extract reliable phenotyping traits from plant images has created a bottleneck.
Here, we will study the problem of estimating the total number of leaves in plant images. The leaf count is a sought-after plant trait, as it is related to the plant's development stage, health, yield potential, and flowering time. Previously, leaf counting was determined using a per-leaf segmentation. The typical approaches for per-leaf segmentation are: (i) image processing to segment leaves, using assumptions and heuristics; or (ii) training a neural network. However, both approaches have drawbacks. Heuristics-based approaches use a set of rules based upon observations that can easily fail. Per-leaf segmentation via neural networks requires fine-grained annotated datasets during training, which are hard to obtain. Alternatively, the estimation of the number of leaves in an image can be addressed as a direct regression problem. In this context, the learning task is relaxed to the prediction of a single number (the leaf count), and the collection of labelled datasets is easy enough that it can also be performed by non-experts.
This thesis discusses the first machine learning algorithm for leaf counting in top-view images of rosette plants. This approach extracted patches from the log-polar representation of the image, allowing us to cancel out leaf rotation. These patches were then used to learn a visual dictionary, which was used to encode the image into a holistic descriptor. As a next step, we developed a shallow neural network to extract rotation-invariant features. Using this architecture, we could learn features that explicitly account for the radial arrangement of leaves in rosette plants. Although the results were promising, leaf counting with rotation-invariant features could not outperform the previous approach.
For this reason, we moved our attention to deep neural networks. However, it is widely known that deep architectures are data-hungry. Therefore, we addressed the problem of how to collect more labelled plant image datasets, using three approaches: (i) we developed an annotation tool to help experts annotate images; (ii) we uploaded images to an online crowdsourcing platform, allowing citizen scientists to annotate them; (iii) we used a generative deep neural network to synthesise images of plants together with their leaf count. Lastly, we show how a deep leaf counting network can be trained with data from different sources and modalities, showing promising results and reducing the performance gap between algorithms and human annotators.
Direct observation of morphological plant traits is tedious and a bottleneck for high-throughput phenotyping. Hence, interest in image-based analysis is increasing, requiring software that can reliably extract plant traits, such as leaf count, preferably across a variety of species and growth conditions. However, current leaf counting methods do not work across species or conditions and therefore lack broad utility. In this paper, we present Pheno-Deep Counter, a single deep network that can predict leaf count in 2D plant images of different species with rosette-shaped appearance. We demonstrate that our architecture can count leaves from multi-modal 2D images, such as RGB, fluorescence, and near-infrared. Our network design is flexible, allowing for inputs to be added or removed to accommodate new modalities. Furthermore, our architecture can be used as is without requiring dataset-specific customization of the internal structure of the network, opening its use to new scenarios. Pheno-Deep Counter is able to produce accurate predictions in many plant species and, once trained, can count leaves in a few seconds. Through our universal and open-source approach to deep counting, we aim to broaden the utilization of machine learning based approaches to leaf counting. Our implementation can be downloaded at https://bitbucket.org/tuttoweb/pheno-deep-counter.
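A hedged PyTorch sketch of the multi-modal design described above, i.e. one branch per modality fused into a single count output; layer sizes, modality names and the fusion choice are illustrative assumptions, not Pheno-Deep Counter's actual architecture.

```python
# Illustrative sketch: one CNN branch per imaging modality, late fusion into a
# single leaf-count output. All sizes and names are arbitrary.
import torch
import torch.nn as nn

def make_branch(in_channels):
    return nn.Sequential(
        nn.Conv2d(in_channels, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )

class MultiModalCounter(nn.Module):
    def __init__(self, modalities):
        super().__init__()
        self.branches = nn.ModuleDict({name: make_branch(ch) for name, ch in modalities.items()})
        self.head = nn.Sequential(nn.Linear(32 * len(modalities), 64),
                                  nn.ReLU(), nn.Linear(64, 1))

    def forward(self, inputs):
        # Missing-modality handling and modality weighting are omitted here.
        features = [self.branches[name](x) for name, x in inputs.items()]
        return self.head(torch.cat(features, dim=1))

model = MultiModalCounter({"rgb": 3, "fluorescence": 1, "nir": 1})
batch = {"rgb": torch.rand(4, 3, 128, 128),
         "fluorescence": torch.rand(4, 1, 128, 128),
         "nir": torch.rand(4, 1, 128, 128)}
print(model(batch).shape)   # torch.Size([4, 1]) -> one count per plant
```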
Imaging the roots of growing plants in a non-invasive and affordable fashion has been a long-standing problem in image-assisted plant breeding and phenotyping. One of the most affordable and widespread approaches is the use of mesocosms, where plants are grown in soil against a glass surface that permits root visualization and imaging. However, due to soil occlusion and the fact that the root image is a 2D projection of a 3D object, parts of the root are occluded. As a result, even under perfect root segmentation, the resulting images contain several gaps that may hinder the extraction of fine-grained root system architecture traits.
We propose an effective deep neural network to recover gaps from disconnected root segments. We train a fully supervised encoder-decoder deep CNN that, given an image containing gaps as input, generates an inpainted version, recovering the missing parts. Since ground truth is lacking in real data, we use synthetic root images, which we artificially perturb by introducing gaps, to train and evaluate our approach. We show that our network can reduce root gaps in both the dicot and monocot cases. We also show promising exemplary results on real data from chickpea root architectures.
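A hedged sketch of this supervised setup (an illustration, not the actual network): a small encoder-decoder CNN is trained on synthetic root masks into which square gaps are artificially introduced, with the unperturbed mask as the target.

```python
# Illustrative sketch: train an encoder-decoder to fill gaps in root masks.
import torch
import torch.nn as nn

inpainter = nn.Sequential(                       # toy encoder-decoder
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
)
optimiser = torch.optim.Adam(inpainter.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

def add_gaps(masks, gap_size=16):
    """Artificially remove a square region from each synthetic root mask."""
    corrupted = masks.clone()
    for m in corrupted:
        y = torch.randint(0, masks.shape[-2] - gap_size, (1,)).item()
        x = torch.randint(0, masks.shape[-1] - gap_size, (1,)).item()
        m[:, y:y + gap_size, x:x + gap_size] = 0.0
    return corrupted

clean = (torch.rand(8, 1, 128, 128) > 0.9).float()   # stand-in synthetic masks
for step in range(5):
    corrupted = add_gaps(clean)
    predicted = inpainter(corrupted)
    loss = loss_fn(predicted, clean)                  # supervised reconstruction
    optimiser.zero_grad(); loss.backward(); optimiser.step()
```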
Background
Image-based plant phenotyping has become a powerful tool in unravelling genotype–environment interactions. The utilization of image analysis and machine learning has become paramount in extracting data stemming from phenotyping experiments. Yet we rely on input from an observer (a human expert) to perform the phenotyping process. We assume such input to be a 'gold standard' and use it to evaluate software and algorithms and to train learning-based algorithms. However, we should consider whether any variability exists among experienced and non-experienced (including plain citizens) observers. Here we design a study that measures such variability in an annotation task of an integer-quantifiable phenotype: the leaf count.
Results
We compare several experienced and non-experienced observers in annotating leaf counts in images of Arabidopsis thaliana to measure intra- and inter-observer variability in a controlled study, using specially designed annotation tools but also citizens using a distributed citizen-powered web-based platform. In the controlled study, observers counted leaves by looking at top-view images, which were taken with low- and high-resolution optics. We assessed whether the utilization of tools specifically designed for this task can help to reduce such variability. We found that the presence of tools helps to reduce intra-observer variability, and that although intra- and inter-observer variability is present, it does not have any effect on longitudinal leaf count trend statistical assessments. We compared the variability of citizen-provided annotations (from the web-based platform) and found that plain citizens can provide statistically accurate leaf counts. We also compared a recent machine learning based leaf counting algorithm and found that, while close in performance, it is still not within inter-observer variability.
Conclusions
While the expertise of the observer plays a role, if sufficient statistical power is present, a collection of non-experienced users and even citizens can be included in image-based phenotyping annotation tasks as long as they are suitably designed. With these findings, we hope to re-evaluate the expectations that we have of automated algorithms: as long as they perform within observer variability, they can be considered a suitable alternative. In addition, we hope to invigorate interest in introducing suitably designed tasks on citizen-powered platforms, not only to obtain useful information (for research) but also to help engage the public in this societally important problem.
The number of leaves a plant has is one of the key traits (phenotypes) describing its development and growth. Here, we propose an automated, deep learning based approach for counting leaves in model rosette plants. While state-of-the-art results on leaf counting with deep learning methods have recently been reported, they obtain the count as a result of leaf segmentation and thus require per-leaf (instance) segmentation to train the models (a rather strong annotation). Instead, our method treats leaf counting as a direct regression problem and thus only requires as annotation the total leaf count per plant. We argue that combining different datasets when training a deep neural network is beneficial and improves the results of the proposed approach. We evaluate our method on the CVPPP 2017 Leaf Counting Challenge dataset, which contains images of Arabidopsis and tobacco plants. Experimental results show that the proposed method significantly outperforms the winner of the previous CVPPP challenge, improving the results by a minimum of ~50% on each of the test datasets, and can achieve this performance without knowing the experimental origin of the data (i.e. in the wild setting of the challenge). We also compare the counting accuracy of our model with that of per-leaf segmentation algorithms, achieving a 20% decrease in mean absolute difference in count (|DiC|).
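A hedged sketch of the direct-regression formulation (an illustrative setup, not the paper's network; the backbone, sizes, and dataset stand-ins are assumptions): a backbone with a single-output head is trained with an MSE loss on counts only, and different datasets are combined simply by concatenating them.

```python
# Illustrative sketch: leaf counting as direct regression, with several
# datasets combined via ConcatDataset. Only the total count is needed as label.
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, ConcatDataset, DataLoader
from torchvision.models import resnet18

model = resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, 1)   # single output: the leaf count

# Stand-ins for e.g. the Arabidopsis and tobacco subsets of the challenge data.
dataset_a = TensorDataset(torch.rand(16, 3, 224, 224), torch.randint(1, 12, (16, 1)).float())
dataset_b = TensorDataset(torch.rand(16, 3, 224, 224), torch.randint(1, 12, (16, 1)).float())
loader = DataLoader(ConcatDataset([dataset_a, dataset_b]), batch_size=8, shuffle=True)

optimiser = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

for images, counts in loader:
    predictions = model(images)
    loss = loss_fn(predictions, counts)
    optimiser.zero_grad(); loss.backward(); optimiser.step()
```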
We propose a multi-input multi-output fully convolutional neural network model for MRI synthesis. The model is robust to missing data, as it benefits from, but does not require, additional input modalities. The model is trained end-to-end, and learns to embed all input modalities into a shared modality-invariant latent space. These latent representations are then combined into a single fused representation, which is transformed into the target output modality with a learnt decoder. We avoid the need for curriculum learning by exploiting the fact that the various input modalities are highly correlated. We also show that by incorporating information from segmentation masks the model can both decrease its error and generate data with synthetic lesions. We evaluate our model on the ISLES and BRATS datasets and demonstrate statistically significant improvements over state-of-the-art methods for single input tasks. This improvement increases further when multiple input modalities are used, demonstrating the benefits of learning a common latent space, again resulting in a statistically significant improvement over the current best method. Lastly, we demonstrate our approach on non-skull-stripped brain images, producing a statistically significant improvement over the previous best method. Code is made publicly available at https://github.com/agis85/multimodal_brain_synthesis.
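A hedged sketch of the multi-input idea (illustrative only; the element-wise max fusion, module sizes, and modality names are assumptions, not the paper's exact design): modality-specific encoders map whichever inputs are available into a shared latent space, the latents are fused, and a single decoder produces the target modality.

```python
# Illustrative sketch: modality-specific encoders map inputs into a shared
# latent space; latents are fused (here with an element-wise max) and decoded.
import torch
import torch.nn as nn

def encoder():
    return nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())

decoder = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 1, 3, padding=1))

encoders = nn.ModuleDict({"t1": encoder(), "t2": encoder(), "flair": encoder()})

def synthesise(inputs):
    """Fuse whichever input modalities are available and decode the target."""
    latents = torch.stack([encoders[name](x) for name, x in inputs.items()])
    fused = latents.max(dim=0).values          # robust to missing modalities
    return decoder(fused)

# Works with any subset of the inputs, e.g. only T1 and FLAIR available:
available = {"t1": torch.rand(2, 1, 64, 64), "flair": torch.rand(2, 1, 64, 64)}
print(synthesise(available).shape)             # torch.Size([2, 1, 64, 64])
```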
Learning invariant representations is a critical task in computer vision. In this paper, we propose the Theta-Restricted Boltzmann Machine (θ-RBM in short), which builds upon the original RBM formulation and injects the notion of rotation invariance into the learning procedure. In contrast to previous approaches, we do not transform the training set with all possible rotations. Instead, we rotate the gradient filters when they are computed during the Contrastive Divergence algorithm. We formulate our model as an unfactored gated Boltzmann machine, where another input layer is used to modulate the input visible layer to drive the optimisation procedure. Among our contributions is a mathematical proof that demonstrates that θ-RBM is able to learn rotation-invariant features according to a recently proposed invariance measure. Our method reaches an invariance score of ~90% on the mnist-rot dataset, which is the highest result compared with the baseline methods and the current state of the art in transformation-invariant feature learning in RBMs. Using an SVM classifier, we also show that our network learns discriminative features, obtaining a testing error of ~10%.
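A toy sketch of the filter-rotation idea during Contrastive Divergence (hedged: not the paper's implementation; biases are omitted and the per-image dominant orientation is assumed given): the filters are rotated to the image's orientation for the CD-1 pass, and the resulting gradient is rotated back before updating the canonical weights.

```python
# Illustrative sketch: a CD-1 step where the weight filters are rotated to the
# dominant orientation of the current image before the gradient is computed.
import numpy as np
from scipy.ndimage import rotate

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rotate_filters(W, angle, side=28):
    """Rotate every filter (a column of W reshaped to side x side) by `angle` degrees."""
    return np.stack([rotate(w.reshape(side, side), angle, reshape=False, order=1).ravel()
                     for w in W.T], axis=1)

def cd1_step(W, v0, angle, lr=0.01):
    """One Contrastive Divergence step with orientation-aligned filters (biases omitted)."""
    W_rot = rotate_filters(W, angle)
    h0 = sigmoid(v0 @ W_rot)                 # positive phase
    v1 = sigmoid(h0 @ W_rot.T)               # reconstruction
    h1 = sigmoid(v1 @ W_rot)                 # negative phase
    grad = np.outer(v0, h0) - np.outer(v1, h1)
    # Rotate the gradient back to the canonical orientation before updating.
    return W + lr * rotate_filters(grad, -angle)

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.01, (784, 64))
image = rng.random(784)                      # flattened 28x28 image (stand-in)
dominant_angle = 30.0                        # assumed given, e.g. from image gradients
W = cd1_step(W, image, dominant_angle)
print(W.shape)                               # (784, 64)
```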
Finding suitable features has been an essential problem in computer vision. We focus on Restricted Boltzmann Machines (RBMs), which, despite their versatility, cannot accommodate transformations that may occur in the scene. As a result, several approaches have been proposed that consider a set of transformations, which are used either to augment the training set or to transform the actual learned filters. In this paper, we propose the Explicit Rotation-Invariant Restricted Boltzmann Machine, which exploits prior information coming from the dominant orientation of images. Our model extends the standard RBM by adding a suitable number of weight matrices, associated with each dominant orientation. We show that our approach is able to learn rotation-invariant features, comparing it with the classic formulation of the RBM on the MNIST benchmark dataset. Overall, requiring fewer hidden units, our method learns compact features, which are robust to rotations.
The synthesis of medical images is an intensity transformation of a given modality in a way that represents an acquisition with a different modality (in the context of MRI, this represents the synthesis of images originating from different MR sequences). Most methods follow a patch-based approach, which is computationally inefficient during synthesis and requires some sort of 'fusion' to synthesize a whole image from patch-level results. In this paper, we present a whole-image synthesis approach that relies on deep neural networks. Our architecture resembles that of encoder-decoder networks and aims to synthesize a source MRI modality into another target MRI modality. The proposed method is computationally fast, does not require extensive amounts of memory, and produces comparable results to recent patch-based approaches.
Counting the number of leaves in plants is important for plant phenotyping, since it can be used to assess plant growth stages. We propose a learning-based approach for counting leaves in rosette (model) plants. We relate image-based descriptors, learned in an unsupervised fashion, to leaf counts using a supervised regression model. To take advantage of the circular and coplanar arrangement of leaves, and also to introduce scale and rotation invariance, we learn features in a log-polar representation. Image patches extracted in this log-polar domain are provided to K-means, which builds a codebook in an unsupervised manner. Feature codes are obtained by projecting patches onto the codebook using triangle encoding, introducing both sparsity and a specifically designed representation. A global, per-plant image descriptor is obtained by pooling local features in specific regions of the image. Finally, we provide the global descriptors to a support vector regression framework to estimate the number of leaves in a plant. We evaluate our method on datasets of the Leaf Counting Challenge (LCC), containing images of Arabidopsis and tobacco plants. Experimental results show that on average we reduce the absolute counting error by 40% w.r.t. the winner of the 2014 edition of the challenge, a counting-via-segmentation method. When compared to state-of-the-art density-based approaches to counting, ~75% fewer counting errors are observed on Arabidopsis image data. Our findings suggest that it is possible to treat leaf counting as a regression problem, requiring as input only the total leaf count per training image.
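A hedged sketch of the encoding and regression stages of such a pipeline (illustrative only; patch extraction from the log-polar image is abstracted away, and pooling is simplified to a mean over all patches): a K-means codebook, triangle encoding of patches, pooling into a global descriptor, and support vector regression on the leaf count.

```python
# Illustrative sketch: K-means codebook, triangle encoding, pooling into a
# global descriptor, and SVR on the leaf count. Data are random stand-ins.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVR

def triangle_encode(patches, kmeans):
    """Triangle encoding: f_k(x) = max(0, mean_distance - distance_to_centroid_k)."""
    dists = kmeans.transform(patches)                    # (n_patches, n_clusters)
    mean_dist = dists.mean(axis=1, keepdims=True)
    return np.maximum(0.0, mean_dist - dists)

rng = np.random.default_rng(0)
n_plants, patches_per_plant, patch_dim, k = 20, 50, 64, 32

# Stand-in data: patches extracted from the log-polar image of each plant.
patches = rng.random((n_plants, patches_per_plant, patch_dim))
leaf_counts = rng.integers(4, 14, n_plants)

kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(patches.reshape(-1, patch_dim))

# Pool each plant's triangle codes into one global descriptor (mean pooling here).
descriptors = np.stack([triangle_encode(p, kmeans).mean(axis=0) for p in patches])

regressor = SVR(kernel="rbf").fit(descriptors, leaf_counts)
print(np.round(regressor.predict(descriptors[:3])))      # estimated leaf counts
```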
High-throughput plant phenotyping is emerging as a necessary step towards meeting the agricultural demands of the future. Central to its success is the development of robust computer vision algorithms that analyze images and extract phenotyping information to be associated with genotypes and environmental conditions for identifying traits suitable for further development. Obtaining leaf-level quantitative data is important for better understanding this interaction. While certain efforts have been made to obtain such information in an automated fashion, further innovations are necessary. In this paper we present an annotation tool that can be used to semi-automatically segment leaves in images of rosette plants. This tool, which is designed to work both stand-alone and in cloud-based environments, can be used to annotate data directly for the study of plant and leaf growth, or to provide annotated datasets for learning-based approaches to extracting phenotypes from images. It relies on an interactive graph-based segmentation algorithm to propagate expert-provided priors (in the form of pixels) to the rest of the image, using the random walk formulation to find a good per-leaf segmentation. To evaluate the tool we use standardized datasets available from the LSC and LCC 2015 challenges, achieving an average leaf segmentation accuracy of almost 97% using scribbles as annotations. The tool and source code are publicly available at http://www.phenotiki.com and as a GitHub repository at https://github.com/phenotiki/LeafAnnotationTool.
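A minimal sketch of scribble-based label propagation with the random walker algorithm (hedged: this uses scikit-image's generic implementation as an illustration, not the tool's own code): a few labelled pixels per leaf and for the background are propagated to every pixel of the image.

```python
# Illustrative sketch: propagate scribble labels to a per-leaf segmentation
# with the random walker algorithm, as the tool does with expert-provided pixels.
import numpy as np
from skimage.segmentation import random_walker

rng = np.random.default_rng(0)
image = rng.random((100, 100))            # stand-in for a grayscale plant image

# Scribbles: 0 = unlabelled, 1 = background, 2 and 3 = two different leaves.
markers = np.zeros_like(image, dtype=np.int32)
markers[5:10, 5:10] = 1
markers[40:45, 40:45] = 2
markers[70:75, 70:75] = 3

labels = random_walker(image, markers, beta=130, mode="bf")
print(np.unique(labels))                  # [1 2 3] -> every pixel receives a label
```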
An interesting and challenging problem in digital image forensics is the identification of the device used to acquire an image. Although the source imaging device can be retrieved from the file's header (e.g., EXIF), this information can easily be tampered with. This leads to the need for blind techniques that infer the acquisition device by processing the content of a given image. Recent studies concentrate on exploiting sensor pattern noise, or on extracting a signature from a set of pictures. In this paper we compare two popular algorithms for blind camera identification. The first approach extracts a fingerprint from a training set of images by exploiting the camera sensor's defects. The second is based on image feature extraction and assumes that images can be affected by colour processing and transformations operated by the camera prior to storage. For the comparison we used two representative datasets of images acquired using consumer and mobile cameras, respectively. Considering both types of cameras, this study is useful to understand whether the techniques designed for classic consumer cameras maintain their performance in the mobile domain.
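A hedged toy sketch of the sensor-pattern-noise idea behind the first approach (illustrative only; a Gaussian filter stands in for the wavelet denoiser typically used, and all data are random stand-ins): a camera fingerprint is the average noise residual of its images, and a query image is attributed to the camera whose fingerprint correlates best with its residual.

```python
# Illustrative sketch: average noise residuals as a camera fingerprint, then
# attribute a query image by correlating its residual with each fingerprint.
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residual(image, sigma=1.0):
    """Residual = image minus a denoised version of itself."""
    return image - gaussian_filter(image, sigma)

def fingerprint(images):
    return np.mean([noise_residual(img) for img in images], axis=0)

def correlation(a, b):
    a, b = a - a.mean(), b - b.mean()
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

rng = np.random.default_rng(0)
camera_a = [rng.random((64, 64)) for _ in range(10)]   # stand-in image sets
camera_b = [rng.random((64, 64)) for _ in range(10)]
fp_a, fp_b = fingerprint(camera_a), fingerprint(camera_b)

query = camera_a[0]                                    # pretend it is unseen
residual = noise_residual(query)
scores = {"camera_a": correlation(residual, fp_a), "camera_b": correlation(residual, fp_b)}
print(max(scores, key=scores.get))
```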
When a picture is shot, all information about the distance between the object and the camera is lost. Depth estimation from a single image is a notable issue in computer vision. In this work we present a hardware and software framework to accomplish the task of 3D measurement through structured light. This technique allows the depth of objects to be estimated by projecting specific light patterns onto the measured scene. The potential of structured light is well known in both scientific and industrial contexts. Our framework uses a pico-projector module provided by STMicroelectronics, driven by the designed software, projecting time-multiplexed Gray code light patterns. The Gray code is an alternative way to represent binary numbers, ensuring that the Hamming distance between two consecutive numbers is always one. Because of this property, this binary coding gives better results for the depth estimation task. Many patterns are projected at different time instants, obtaining a dense coding for each pixel. This information is then used to compute the depth of each point in the image. In order to achieve better results, we also integrate the depth estimation with the inverted Gray code patterns, to compensate for projector-camera synchronization problems as well as noise in the scene. Even though our framework is designed for laser pico-projectors, it can be used with conventional image projectors, and we present results for this case too.
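A small sketch of how such time-multiplexed Gray code stripe patterns (and their inverses) can be generated (hedged: the resolution and bit count below are illustrative, not the framework's actual settings); each projector column receives a unique code whose consecutive values differ in exactly one bit.

```python
# Illustrative sketch: generate time-multiplexed Gray code stripe patterns and
# their inverted counterparts, one pair per bit plane.
import numpy as np

def gray_code_patterns(width=1024, height=768, n_bits=10):
    columns = np.arange(width)
    gray = columns ^ (columns >> 1)            # binary -> Gray code
    patterns = []
    for bit in range(n_bits - 1, -1, -1):      # most significant bit first
        stripe = ((gray >> bit) & 1).astype(np.uint8) * 255
        pattern = np.tile(stripe, (height, 1)) # vertical stripes across the image
        patterns.append(pattern)
        patterns.append(255 - pattern)         # inverted pattern for robustness
    return patterns

patterns = gray_code_patterns()
print(len(patterns), patterns[0].shape)        # 20 patterns of size (768, 1024)
```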