Business Intelligence

Crop analysis and fraud detection in agriculture based on satellite data

Time series analysis and classification through cluster analysis, used to identify homogeneous groups of crops in different areas of Lombardy.

The NDVI vegetation indices were calculated from Sentinel-2 satellite data acquired at regular 5-day intervals.

Reflectance, measured as the percentage of radiation re-emitted in specific bands of the electromagnetic spectrum such as near infrared (NIR), red (RED), and short-wave infrared (SWIR), indicates in combination the health or water stress of the plants.
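As a minimal sketch of how the index is obtained from the bands, NDVI is the normalized difference between the NIR and RED reflectances, (NIR - RED) / (NIR + RED). The function name `ndvi` and the sample reflectance values below are illustrative assumptions, not taken from the project.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - RED) / (NIR + RED)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    # Where both bands are zero the index is undefined; leave NaN there.
    out = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out

# Hypothetical reflectance values (fractions of incoming radiation).
nir = [0.45, 0.30, 0.0]
red = [0.05, 0.20, 0.0]
values = ndvi(nir, red)
```

Values close to 1 indicate dense, healthy vegetation, while values near 0 are typical of bare soil.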

The NDVI indices are therefore available for multiple dates of the same season and allow us to obtain relatively reliable information on the type of crop and its development status:

from the analysis of NDVI trends it is possible to tell whether the crop corresponds to the declared one and whether the cultivation method is adequate.
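One possible way to turn this check into a rule, assuming each parcel has already been assigned to a cluster of similar NDVI series, is to flag parcels whose declared crop differs from the most common declared crop in their cluster. The function `flag_mismatches`, the cluster labels, and the crop names are hypothetical and only illustrate the idea. The project's actual decision rule is not described here.

```python
from collections import Counter

def flag_mismatches(cluster_labels, declared_crops):
    """Return indices of parcels whose declared crop differs from the
    most common declared crop within their own cluster."""
    counts_by_cluster = {}
    for label, crop in zip(cluster_labels, declared_crops):
        counts_by_cluster.setdefault(label, Counter())[crop] += 1
    # Majority declared crop per cluster.
    majority = {label: counts.most_common(1)[0][0]
                for label, counts in counts_by_cluster.items()}
    return [i for i, (label, crop)
            in enumerate(zip(cluster_labels, declared_crops))
            if crop != majority[label]]

# Hypothetical example: six parcels, two clusters.
labels = [0, 0, 0, 1, 1, 1]
declared = ["maize", "maize", "rice", "rice", "rice", "rice"]
suspects = flag_mismatches(labels, declared)
```

A flagged parcel is only a candidate for closer inspection, since a mismatch can also come from a noisy series or a poorly separated cluster.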

In the data mining process carried out on all NDVI time series, the following are of particular interest:

  • analysis and reduction of disturbing factors (e.g. clouds) by averaging or linear interpolation
  • implementation and comparison of different hierarchical and partitioning clustering algorithms
  • advanced evaluation criteria and analysis of the clusters obtained, including the calculation of cohesion and aggregation measures
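The steps listed above can be sketched in a few lines, under stated assumptions. Cloud-masked dates are represented as NaN and filled by linear interpolation. A plain k-means stands in for the partitioning algorithms (hierarchical methods are not shown). Cohesion is taken as the within-cluster sum of squares, with a between-cluster sum of squares as a companion measure. All function names, parameters, and data are illustrative and do not reproduce the original KNIME workflow.

```python
import numpy as np

def fill_gaps(series):
    """Linearly interpolate NaN entries (e.g. cloud-masked dates).
    Gaps at the ends are filled with the nearest valid value."""
    series = np.asarray(series, dtype=float)
    idx = np.arange(series.size)
    valid = ~np.isnan(series)
    return np.interp(idx, idx[valid], series[valid])

def kmeans(X, k, n_iter=50, seed=0):
    """Basic k-means: returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance of every series to every center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def cohesion_separation(X, labels, centers):
    """Within-cluster (cohesion) and between-cluster sums of squares."""
    wss = sum(((X[labels == j] - c) ** 2).sum()
              for j, c in enumerate(centers))
    overall = X.mean(axis=0)
    bss = sum((labels == j).sum() * ((c - overall) ** 2).sum()
              for j, c in enumerate(centers))
    return wss, bss

# Hypothetical NDVI series for four parcels over five dates; NaN = cloudy.
raw = np.array([
    [0.2, np.nan, 0.6, 0.8, 0.5],
    [0.2, 0.4, np.nan, 0.8, 0.4],
    [0.7, 0.6, np.nan, 0.2, 0.2],
    [0.7, np.nan, 0.3, 0.2, 0.1],
])
X = np.array([fill_gaps(row) for row in raw])
labels, centers = kmeans(X, k=2)
wss, bss = cohesion_separation(X, labels, centers)
```

A low within-cluster value relative to the between-cluster value suggests compact, well-separated groups. Comparing these numbers across algorithms or values of `k` is one simple way to evaluate alternative clusterings.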

KNIME, the free and open-source analytics platform used for the analysis, made it possible to carry out delicate data mining and processing of the results, also through its R integration.