DLG Expert report 04/2022
2nd edition, date of issue 03/2024
Author:
Dr Andreas Müller
Physicist and engineer, freelance specialist for food safety and integrity
Telephone: +49 162 320 7692
Andreas.Mueller@stem-in-foodsafety.de
www.stem-in-foodsafety.de
The terms ‘artificial intelligence’ (AI) and ‘machine learning’ (ML) are increasingly used in an inflationary manner in the food industry, too, partly for advertising purposes, without the recipients of the information being helped to understand how AI and ML are actually applied to the issue in question and, in particular, why. Do AI and ML, as modern secret ingredients, really secure a crucial competitive advantage even where solutions may already exist? Why are artificial neural networks better than the experience and solutions already established in practice? Is the company’s problem even structured in a way that allows it to be addressed with AI and ML in the first place? Advertising claims rarely help to answer these questions, and the claimed use of AI and ML often remains a mystery. This Expert Knowledge Series publication therefore explains how AI and ML work in order to establish a general understanding of these technologies. Various examples of ML and AI applications are then presented, and an outlook for improved risk prevention along the food supply chain is outlined.
Introduction
The ‘digitalisation’ of information is progressing relentlessly – particularly with the continued spread of smartphones – in both professional and private areas of life. Paper is increasingly being replaced by bits and bytes, and the entirety of humanity’s knowledge is becoming accessible to everyone with the aid of the Internet. At the same time, the Internet and the free availability of information present users with new challenges: the good old paper library has clearly labelled sections for non-fiction and fiction that help readers classify what they are reading. The Internet, by contrast, places truths, untruths, tactical disinformation, hypotheses, delusions, opinions, fictions, theories and more alongside one another on an equal footing and without any such helpful labelling. What information do we want to use as the basis for making decisions in the future?
With the growing power of computers and networks and the increasing speed and density with which information is distributed as language, text and images, terms that were very popular back in the 1950s and then lay dormant for decades are being rediscovered. These terms include ‘artificial intelligence’ (AI) and ‘machine learning’ (ML) as well as implementation vehicles such as the mysterious ‘artificial neural networks’ and ‘data stream-oriented gradient methods’.
These reincarnated terms have also been making their way into the food industry in various forms over the past few years. AI and ML are often advertised as modern ingredients that are claimed to provide a crucial competitive advantage.
Definitions:
Artificial intelligence (AI) is an overarching designation for methods and processes of practical computer science, particularly knowledge processing and knowledge-based systems, and in parts of the cognitive sciences. Its objectives are broadly defined. The most far-reaching of them, under the aspect of completely replacing humans, is the machine-computational replication of typical human capabilities, particularly intelligence and the ability to make decisions based on experience.
Source: A. M. Turing: ‘Computing machinery and intelligence.’ In: E. A. Feigenbaum et al. (eds.): ‘Computers and thought.’ New York, 1963.
Machine learning (ML) is a field of practical computer science that is often used in combination with artificial intelligence or is regarded as a sub-area of AI. However, the term ‘machine learning’ was developed independently of the development of ‘artificial intelligence’ and was thought up by Arthur L. Samuel in connection with strategies for winning at draughts. ML is aimed at training computers and enabling them to act without being programmed for a specific situation. Machine learning is a data analysis approach that includes the creation and adaptation of models which enable programs to ‘learn’ through experience. Machine learning involves the development of algorithms that dynamically adapt their models in order to improve their ability to make predictions and therefore improve the success of decisions.
Source: A. L. Samuel. ‘Some Studies in Machine Learning Using the Game of Checkers’. IBM Journal of Research and Development 3:3, 1959, pp. 210–229.
AI and ML – an everyday example
As computers and computer systems become more powerful, AI and ML are becoming increasingly important, because the technical prerequisites are constantly improving. They are no longer merely a theoretical and academic subject area.
One of the first uses of AI during the second half of the 1950s was letter and number recognition with the aid of optical sensor arrays similar to modern camera sensors, but with fewer pixels, i.e. 20x20 instead of the 6,000x4,000 that are common today. 20x20 was the resolution that was achievable in the laboratory at that time and could be processed with the available computer technologies1.
The recognition of letters and numbers worked very well with standard fonts and optimum contrast ratios. However, even minor deviations from the optimum conditions led to errors, because the optical information that the technology evaluated and transcribed into letters and lines was obviously not unambiguous. Humans evaluated these errors and varied the algorithm such that probabilities of belonging to the actual characters were assigned to typical deviations from the optimum, which in turn dramatically reduced the error rate. This was a combination of ‘human learning’ and a very rudimentary form of AI.
1 See also: yann.lecun.com/exdb/mnist/index.html. The recognition of handwritten numbers is the ‘Hello World’ of artificial intelligence and ‘Chapter 1’ when it comes to neural networks.
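To make this principle tangible, the following minimal Python sketch compares a 20x20 sensor image against reference templates and assigns a pseudo-probability to each character; an optional weight map tolerates the ‘typical deviations’ described above. All names, the weighting idea and the scoring rule are illustrative assumptions, not a reconstruction of the historical systems.

```python
import numpy as np

def recognise(glyph, templates, deviation_weights=None):
    """Assign the most probable character to a 20x20 binary sensor image (sketch).

    glyph             -- np.ndarray of shape (20, 20) with 0/1 sensor readings
    templates         -- dict: character -> reference 20x20 array (0/1)
    deviation_weights -- optional dict: character -> 20x20 array that down-weights
                         pixels where deviations from the template are 'typical'
    """
    probabilities = {}
    for char, ref in templates.items():
        mismatch = np.abs(glyph - ref).astype(float)
        if deviation_weights is not None:
            mismatch *= deviation_weights[char]   # tolerate typical deviations
        # Pixel agreement mapped to a simple pseudo-probability per character.
        probabilities[char] = 1.0 - mismatch.sum() / glyph.size
    best = max(probabilities, key=probabilities.get)
    return best, probabilities[best]
```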
Pictures and optical images and their restoration are highly suitable for understanding the way in which AI and ML work. Unfocussed (historical) photos that came about due to inaccurate camera settings or limited technical resources are a prime example of the application of AI and ML.
In the following example case, the blurred portrait of the author is to be improved or made sharper. No information on the camera and its settings when the picture was taken is available. (See Figure 1)
The objective is the detailed reconstruction of the picture using external image material. This means that the intelligent and appropriate blending of high-resolution details from other photos is used to create an overall collage that shows the desired details without losing the visual correlation to the (blurred) original.
This necessitates a high number of high-resolution photos with optimum focus and detail which should resemble the original as closely as possible. This ‘similarity’ can be defined using the following attributes, which the auxiliary photos must have:
- Type of motif (in this case: photo, portrait, frontal)
- Foreground-background brightness distribution
- Gender
- Hair or head covering
- Facial hair ‘beard’
- Approximate age
- Glasses
- Eye colour
- (Others if necessary)
Such an image search could then lead to the following result (see Figure 2, small selection from approx. 612,000 similar hits on the Internet; details blurred for data protection reasons).
The quality of the selection can be improved further using simple technical means. The face is the most important distinguishing feature for humans. The technical mapping of facial characteristics (see Figure 3) is carried out by means of connecting lines between distinctive points that every face has, a process referred to as biometrics. The wealth of image material can be very effectively delimited even further if only content with biometric data that corresponds within tightly defined limits is used.
Attention: The legal regulations on data protection and protected personal rights must be complied with under all circumstances when using third-party photos to reconstruct details. The selection of high-resolution photos that is used can be generated and strictly anonymised using relevant and usually chargeable image databases with fully licenced image material, for instance. After processing and application, the data is then irrevocably deleted and the details used for the photo to be reconstructed can no longer be traced back to the original photo. Biometric decomposition was used exclusively for the author’s photo in the example shown here. No other photos were subjected to biometric analysis.
The task of the AI is now to find areas within these high-resolution third-party photos that are suitable for substituting blurred areas in the original photo with high-resolution image material without changing the overall impression of the original photo. Third-party image material should therefore be used to simulate a high resolution. To do this, a grid can be superimposed over the photos and the image content can be compared. However, dynamic subdivision according to biometric fields and areas with numerous contrast variations is far more suitable. The eye and mouth areas that are important for the assessment of the result by humans are checked and simulated in far greater detail than the forehead, the upper parts of the head and the cheek areas, for instance, which have hardly any individual structures.
An important boundary condition now has to be established for any use of AI and ML: the stability or convergence criterion. This has already been mentioned above: ‘the overall impression of the photo should not change as a result of the substitutions’. A machine-usable metric for this requirement could be, for instance:
A reference thumbnail, i.e. a small reference image with a significantly reduced resolution, is generated from the original image by averaging 32x32 pixels in each case (brightness and colour values). The following is required for the AI:
- The biometric distances and their ratios may only differ from the original by a maximum of 5% after any substitution. The following must additionally apply:
- A temporary thumbnail is calculated from the resulting image following each substitution. This may differ from the reference thumbnail by a maximum of 15% in terms of its contrast gradients and by a maximum of 10% from its absolute position within the thumbnail2. Otherwise, the substitution that has been undertaken is discarded.
- After 32 individual substitutions have been undertaken, an assessment is carried out by a human being3.
To express it differently, no use of substitute image material may change the outlines of the original image. Following 32 substitutions that are permissible in this sense, the overall result is assessed by a human. Searching the image memory in the biometric areas for suitable high-resolution areas of faces and inserting these as seamlessly as possible into the (still) blurred original image while observing the stability criteria with adjusted brightness and colour values is a programming task that is easy to implement.
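A sketch of how such a stability check could be implemented is shown below (Python with NumPy). The 5%, 15% and 10% thresholds are those stated above; how exactly the ‘contrast gradients’ and the ‘absolute position’ are measured is not specified, so the gradient magnitudes and the brightness centroid used here are assumptions, as are the function and parameter names.

```python
import numpy as np

def make_thumbnail(image, block=32):
    """Average non-overlapping block x block pixel tiles (brightness values)."""
    h, w = image.shape[0] // block, image.shape[1] // block
    return image[:h * block, :w * block].reshape(h, block, w, block).mean(axis=(1, 3))

def relative_change(before, after):
    before, after = np.asarray(before, dtype=float), np.asarray(after, dtype=float)
    return np.abs(after - before) / (np.abs(before) + 1e-9)

def substitution_is_stable(ref_thumb, candidate_image, ref_biometrics, cand_biometrics):
    """Return True if a substitution respects the stability criteria described above."""
    # Biometric distances and their ratios: at most 5 % deviation from the original.
    if np.any(relative_change(ref_biometrics, cand_biometrics) > 0.05):
        return False
    thumb = make_thumbnail(candidate_image)
    # Contrast gradients of the temporary thumbnail: at most 15 % deviation
    # (simplified here as the mean gradient magnitude of the thumbnail).
    g0_ref, g1_ref = np.gradient(ref_thumb)
    g0, g1 = np.gradient(thumb)
    if relative_change(np.hypot(g0_ref, g1_ref).mean(), np.hypot(g0, g1).mean()) > 0.15:
        return False
    # Absolute position within the thumbnail: at most 10 % shift
    # (simplified here as the shift of the brightness centre of mass).
    def centre(t):
        rows, cols = np.indices(t.shape)
        return np.array([(rows * t).sum(), (cols * t).sum()]) / t.sum()
    shift = np.abs(centre(thumb) - centre(ref_thumb)) / np.array(ref_thumb.shape)
    return bool(np.all(shift <= 0.10))
```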
At this point, initial substitutions have been carried out in the original image under consideration of the convergence criteria. The AI now has to be provided with ‘feedback’ concerning the result. Ethics that break ‘good’ and ‘bad’ down in a machine-usable manner have to be defined in order to do this.
Figure 4 shows the result of the first cycle of substitutions and the respective thumbnail. The defined stability criteria are obviously met (picture-in-picture): the thumbnail unmistakably shows the author. The substitutions undertaken in the original image are of very heterogeneous quality. While the mouth and nose are already shown with a high level of detail and extensively free of artefacts, the entire eye area, including the glasses, appears disastrous and unnatural. The upper part of the head and the transition to the neck and chest area have not yet been taken into consideration.
These assessments from the perspective of a human observer now have to be mirrored to the AI. To do this, the changes made in the mouth and lower nose area (identifiable by the corresponding biometric field combinations) are given a high number of points, while the ears and the eye and eyebrow area are given a medium and a low number of points respectively4. In an initial step, this provides information on which was a ‘good’ and which was a ‘bad’ substitution. Each individual substitution can also be quantified by means of a before/after comparison (e.g. totalled contrasts, etc.), with the result that each implemented substitution is associated with a change and an assessment.
2 The selected percentages are dependent on the volume of image material available for the substitutions. The more material is available, the more strictly the limits can be defined. The numbers are essentially arbitrary.
3 The number 32 has been chosen at random in this case. The substitutions should be recognisable and distinguishable for a human being and a differentiated assessment of their suitability should be possible.
4 If programmed as an artificial neural network, the associations leading to the changes in the mouth and nose area would be strengthened. Accordingly, the associations leading to the changes in the eye area would be weakened. In this publication, we use a scoring method or a debit/credit method, i.e. the award or deduction of points for changes similar to movements in a current account.
In the subsequent cycles, the AI will now use the substitutions associated with ‘good’ as the basis for assessing further substitutions. The effect of substitutions that are found to be ‘good’ is reinforced, whereas substitutions that are found to be ‘bad’ are reversed and replaced with another selection, which is then associated with ‘good’. After all, a high volume of image source material is available. After every 32 further substitutions, a new assessment is carried out by a human until a) each biometric area has been substituted and assessed at least once and b) no further ‘bad’ substitutions have been undertaken wherever possible. This is usually the case after approximately five to ten cycles. Thereafter, the process optimises itself autonomously and no longer requires human interaction. The AI’s training is complete, the algorithm runs by itself and, as long as the stability criteria are met, substitutions are also permitted to overlap, i.e. substitutions may be replaced by even better substitutions.
Guided by the training and the ‘good’/’bad’ contrasts, the algorithm will now search through the image memory for possible areas that improve the original image until either a maximum number of cycles is reached (or a defined maximum processing time) or the before/after comparisons no longer achieve any further significant improvement and stagnate (statistical significance analysis). Depending on the (human) consistency in the training phase, it can occur that the algorithm finds several ‘optimum’ solutions during this process. The AI offered two solutions, each consisting of several hundred virtually identical individual substitutions, for the chosen example of the blurred portrait of the author. Figure 5 shows one of the end results compared to the (blurred) original.
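The training cycle just described could be sketched roughly as follows in Python. The scoring (‘debit/credit’) logic, the 32-substitution cycles and the reversal of ‘bad’ substitutions follow the text; the image helpers with_patch/without_patch, the patch objects with an .id attribute and all parameter names are hypothetical placeholders.

```python
import random

def train_substitutions(image, patch_pool, biometric_fields,
                        human_feedback, is_stable,
                        cycle_size=32, max_cycles=10):
    """Scoring ('debit/credit') training loop for patch substitutions (sketch).

    image          -- working image object with the hypothetical helpers
                      .with_patch(field, patch) and .without_patch(field)
    patch_pool     -- dict: biometric field -> list of candidate patch objects
    human_feedback -- callable(list of (field, patch)) -> list of points,
                      the human assessment after each cycle of 32 substitutions
    is_stable      -- callable(image) -> bool, the stability criterion
    """
    scores = {}      # accumulated points per candidate patch ('current account')
    accepted = {}    # biometric field -> currently applied patch
    for _ in range(max_cycles):
        cycle = []
        for _ in range(cycle_size):
            field = random.choice(biometric_fields)
            # Prefer the candidate whose accumulated score is highest ('good' so far).
            patch = max(patch_pool[field], key=lambda p: scores.get(p.id, 0))
            trial = image.with_patch(field, patch)
            if not is_stable(trial):
                continue                      # violates the stability criterion: discard
            image, accepted[field] = trial, patch
            cycle.append((field, patch))
        points = human_feedback(cycle)        # human assessment: points per substitution
        for (field, patch), pts in zip(cycle, points):
            scores[patch.id] = scores.get(patch.id, 0) + pts
            if pts < 0:                       # 'bad' substitution: reverse it
                image = image.without_patch(field)
                accepted.pop(field, None)
        if len(accepted) == len(biometric_fields) and all(p >= 0 for p in points):
            break                             # every area substituted, none 'bad'
    return image, scores
```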
Apart from colour and brightness distributions, the right-hand image no longer contains any information from the original photo, but consists entirely of fragments composed from thousands of suitable individual photos using artificial intelligence. Of course, the author recognises subtle differences compared to the original in the mirror, but the AI delivers an astonishingly good result, including unchangeable characteristics (scars), wrinkles in the shirt he’s wearing and the pattern on his tie.
Together with the stability criteria, the trained design of the ethics corridor can now be applied to every portrait virtually without any further training effort. With the information developed up to this point, almost any portrait or facial photo can be reconstructed with astonishing quality. An additional need for training arises for the AI if the image structure differs significantly from that shown here, e.g. in the case of beards or glasses with coloured lenses, because these can influence the biometrics. ‘Remastering’ old photographic and film material using AI has since become a fast-growing industrial sector of its own.
The basic procedure for selecting problems for the use of AI and ML and implementing them is almost always the same, meaning that this method can also be demonstrated on the basis of examples from the extended food industry.
Delimitation of ML and AI from iterative optimisation methods
First of all, it is necessary to delimit the use of ML and AI from numerical and iterative optimisation methods, which are sometimes labelled ‘artificial intelligence’ but have nothing to do with it.
As previously explained, AI is very suitable for intelligently compensating for a lack of information by recognising patterns and specifically extrapolating them by means of alternative sources of information. A common feature of almost all AI applications is that an objective that can be easily described in mathematical terms must exist (the stability criterion) and that, for instance, ethics have been communicated via the ML process that enable the AI to distinguish between good and bad results or to undertake autonomous optimisation. In practice, AI solutions are advertised time and again for issues that are pure optimisation tasks and can therefore be solved using closed-form mathematics. In this process, the solution is frequently found iteratively in order to keep applications flexible; however, iterative algorithms do not equate to AI and must therefore be regarded as conceptually entirely different.
Determining the slope and the axis intercept of a best-fit straight line, for instance, is a very simple optimisation task for an effect that results e.g. from experimentally determined measured values behind which the theory postulates a linear relationship (see Figure 6). We already know the function and have much more information than we need to define a straight line; the problem is thus overdetermined in the mathematical sense. The optimisation task is then formulated as follows: ‘We are looking for exactly that best-fit straight line for which the sum of the squared deviations5 of the measuring points from the straight line in the y direction becomes minimal’. To solve the problem, this sum is expressed as a function of the axis intercept and the slope.
As we know from school, a minimum means that the first derivative of the function disappears (horizontal tangent). This results in a linear normal equation system: two equations with two unknowns. This can be solved directly in a single step using the methods of school mathematics and delivers the axis intercept and the slope of the best-fit straight line. The problem is hereby solved.
5 The reasons for using the ‘squares of the deviations’ are to be found in probability theory and statistics. When the squares are used, other interesting variables become accessible and can be calculated consistently, such as the correlation of the measuring points or the degree of fulfilment of the hypothesis that a linear relationship genuinely exists here.
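Written out for the best-fit straight line y = a + bx, the optimisation task and the resulting normal equation system take their familiar textbook form:

```latex
\[
S(a,b) \;=\; \sum_{i=1}^{n}\bigl(y_i - a - b\,x_i\bigr)^{2} \;\longrightarrow\; \min,
\qquad
\frac{\partial S}{\partial a} = 0, \quad \frac{\partial S}{\partial b} = 0
\]
\[
\begin{aligned}
n\,a \;+\; b\sum_i x_i &= \sum_i y_i,\\
a\sum_i x_i \;+\; b\sum_i x_i^{2} &= \sum_i x_i y_i
\end{aligned}
\qquad\Longrightarrow\qquad
b = \frac{n\sum_i x_i y_i - \sum_i x_i \sum_i y_i}{n\sum_i x_i^{2} - \bigl(\sum_i x_i\bigr)^{2}},
\quad
a = \bar{y} - b\,\bar{x}
\]
```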
The mathematical procedure for non-linear regressions is basically identical, but cannot always be solved using analytical mathematics. A numerical solution functions iteratively by gradually working towards the optimum on the basis of starting values. With very complex functions and issues, this can involve extremely intensive computing. Nevertheless, such an algorithm still fails to meet the criteria for the use of the term ‘artificial intelligence’: there is no need for a training phase, nor is a ‘good/bad’ ethics corridor necessary as a decision-making basis for automatically generated solution evolutions. The first step in all optimisation tasks should always be to check whether a classic regression approach can be found. If the answer is no, the prerequisites for using AI to solve the problem may be present.
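For contrast, the following short Python sketch shows what such an iterative numerical fit looks like in practice, here using SciPy’s curve_fit on a hypothetical exponential model with synthetic data. The solver simply iterates from the starting values p0 towards the optimum; no training phase and no ethics corridor are involved.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical non-linear model: exponential decay with two parameters.
def model(x, a, b):
    return a * np.exp(b * x)

# Synthetic 'measured values' for illustration only.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 30)
y = model(x, 2.0, -0.7) + rng.normal(0.0, 0.02, x.size)

# Iterative least-squares fit: the solver works towards the optimum starting
# from the initial guesses p0. No training, no 'good/bad' corridor, no AI.
params, covariance = curve_fit(model, x, y, p0=[1.0, -0.1])
print(params)   # approximately [2.0, -0.7]
```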
The following chapter lists examples from the extended food industry; while these involve optimisation, the boundary conditions are constantly changing, with the result that classic optimisation methods cannot be applied without disturbances.
Application of AI in the extended food industry
In the following, we will take the ‘extended’ food industry to mean the value chain or parts of it, but also supporting processes such as quality assurance, logistics and retail. The following are suitable examples of the application of AI methods:
- ‘Sushi-on-demand’ or production of highly individualised themed cakes or similar patisserie products.
- Sample management in service laboratories in the field of food safety.
- Shipping logistics for online food purchase orders.
- Initiatives for reducing food losses and waste in the food retail sector.
At first glance, these four examples appear to be very different because their topics hardly overlap. From the perspective of a user of artificial intelligence, however, they are similar: the architecture of the applicable mathematical models is virtually identical, even though very different terms are assigned to the elements of this architecture. Based on the mathematical discipline of graph theory, such a basic mathematical architecture can be represented schematically as shown in the following example in Figure 7.
This graph representation is schematic and can appear simpler or more complex depending on the issue. Only two basic elements are required: the state (symbolised by a circle) and the transition from one state to the next (symbolised by a yellow arrow). States are characterised by a set of information. Transitions lead to a defined change in this information. The content can be folded up and down depending on the desired or available degree of detail. The structure is similar on all detail levels, i.e. the structure remains the same, but the content is variable.
On the far left of the graph we have an input (green, shown here as just one state for reasons of simplicity). On the far right of the graph we have the desired output (green) generated by the process in between. The graph is always structured such that the overall process of arriving at the output from the input is formalised and mapped as reproducibly as possible. In practice, it will not be possible to process all inputs at the same time and in parallel. A sequence in which the inputs are to be processed must be defined, whereby the inflow of new inputs can be volatile. Whether the processing dimension is time (examples 1 to 3: sushi, laboratory, shipping) or a combination of time and sales price (example 4: reduction of food waste) is irrelevant for the use of AI. The AI must suggest the processing sequence according to criteria that can be described but are dynamic in their application, taking the objective into consideration or, in the terminology used here, taking the stability criteria into consideration.
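As a minimal sketch, the two basic elements and their combination into a process graph could be represented in Python as follows; the class and field names are assumptions chosen for illustration, with a tiny fragment of example 2 (the laboratory) attached.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

@dataclass
class State:
    """A state (circle): characterised by a set of information."""
    name: str
    info: Dict[str, object] = field(default_factory=dict)

@dataclass
class Transition:
    """A transition (arrow): leads to a defined change of a state's information."""
    name: str
    apply: Callable[[Dict[str, object]], Dict[str, object]]

@dataclass
class ProcessGraph:
    """Input on the left, desired output on the right, transitions in between."""
    states: Dict[str, State]
    successors: Dict[str, List[str]]                 # state name -> next state names
    transitions: Dict[Tuple[str, str], Transition]   # (from, to) -> transition

# Illustrative fragment of example 2 (laboratory): 'sample received' -> 'registered'.
received = State("sample received", {"test_scope": ["microbiology", "chromatography"]})
registered = State("registered")
register = Transition("register in LIMS", lambda info: {**info, "lims_id": "hypothetical"})
graph = ProcessGraph(
    states={s.name: s for s in (received, registered)},
    successors={"sample received": ["registered"], "registered": []},
    transitions={("sample received", "registered"): register},
)
```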
The respective inputs, outputs and problem definitions for the AI and the associated stability criteria could be defined as follows for the four examples:
Example 1: sushi on demand or themed cakes made according to customer specifications
Input
Customer purchase order for freshly prepared composition with three possible priorities:
A: immediate (customer is waiting)
B: urgent (purchase order with collection date)
C: plannable (regular purchase order for display or resellers)
Output
Product customised according to customer specifications (shape and quantity) ready for collection on the customer’s preferred date.
Meaning of the graph representation
Detailed breakdown of the individual (manual) production steps up to the compiled end product. Dashed: passing through of bought-in components that do not have to be changed to produce the end product.
Problem definition for the AI
Continuous monitoring of incoming purchase orders. Creation of a dynamic work list for optimised processing of purchase orders subject to adherence to customer’s preferred deadlines.
Stability criterion
All purchase orders should be processed within the available time. It should be possible to specify a completion deadline before the purchase order is accepted if the customer’s preferred deadline cannot be met due to capacity reasons.
Example 2: sample management in a service laboratory
Input
Customer order (sample) with defined scope of testing for various laboratory areas with three possible priorities:
A: immediate (release analysis, produced goods are waiting for laboratory results for sale)
B: urgent (samples received today will be started tomorrow at the latest)
C: plannable (quarterly analyses, sample procurement organised by the laboratory)
Output
Performance of the required analyses and summarisation in an analytical test report within the time period agreed with the customer.
Meaning of the graph representation
Detailed breakdown of the individual process steps in the various laboratory areas, e.g. microbiology, wet chemistry, chromatography. Compilation of the summarising test report. Dashed: context information on the problem that is not included in the test report.
Problem definition for the AI
Continuous monitoring of the incoming samples. Creation of a dynamic work list for optimised processing of the test scopes in the various laboratory areas subject to adherence to the first agreed completion date for all samples.
Stability criterion
All analyses should be carried out within the available time.
Example 3: shipping logistics for online food purchase orders
Input
Customer purchase order for various articles including fresh, chilled and frozen, partly in stock with three possible priorities:
A: immediate (standard for goods in stock)
B: as quickly as possible (ordered articles partially out of stock, repeat order, consolidated delivery)
C: plannable (forward purchase order, stock replenishment for linked distribution centres)
Output
Complete picking of the customer purchase order in suitable transport packaging and provision of the delivery ready for shipping within the time period agreed with the customer.
Meaning of the graph representation
Detailed breakdown of the individual process steps for picking the shipment. Creation of packing lists. Compilation of the ordered articles from the various warehouse areas. Pre-picking in suitable transport boxes (if necessary, compilation of several packages depending on the transport conditions, e.g. room temperature, chilled goods, frozen goods). Determination of dimensions and weights and creation of the shipping documents. Final picking, i.e. ‘close boxes’. Dashed: add information material and promotional items.
Problem definition for the AI
Continuous monitoring of incoming purchase orders. Creation of a dynamic work list for optimised processing of purchase orders subject to adherence to customer’s preferred deadlines.
Stability criterion
All picking should be completed within the available time.
Example 4: reduction of food waste in the food retail sector
Input
Article-related stock and replenishment data with batch-related information on the best-before date (BBD). Subdivision into three issues:
A: very difficult (long delivery time, short BBD, volatile demand)
B: moderately difficult (short delivery time, short BBD, volatile or poss. seasonal demand)
C: simple (short delivery time, long BBD, consistent and less volatile demand)
Output
Dynamic batch-related price corrections for in-stock articles with a short remaining BBD. Early return of partial quantities in the case of surplus stocks anticipated in the long term for the purpose of redistribution or further processing to form long-life products.
Meaning of the graph representation
Detailed breakdown of the articles or goods groups affected by stock changes due to selling off and replenishment. Dashed: any promotional goods or regionally procured goods that are certain to be sold off in full, article groups not affected by the initiative.
Problem definition for the AI
Continuous monitoring of stocks and their batch-related BBDs. Transformation into sell-offs and sell-off forecasts. Generation of price reduction suggestion lists for articles for which residual stock is forecast on the BBD.
Stability criterion
No decline in unit sales of regularly priced goods compared to retrospective unit sales volumes, i.e. no cannibalisation. Maximum sell-off of discounted goods with a target stock level of ‘zero’ on the BBD and sales revenue greater than a threshold value to be defined (e.g. purchase price).
An AI solution can be introduced almost identically for examples 1 to 3; only the terms used would have to be exchanged and reallocated. In the following, the procedure is described in detail using example 2, sample management in the service laboratory.
The more information that is available on the initial situation, the less training effort is required during the ML phase and the faster the AI delivers stable, consistent solutions. If information is missing, the ML phase is extended and stable solutions may no longer be possible.
For the example of the service laboratory, a very suitable compilation of information consists of the following elements:
- Forecast for the expected inflow of samples with the associated test scopes: number, time of physical arrival, scope of testing.
- Customer’s preferred date for the completion of the analyses or the test report.
- Work instruction for each method to be performed (i.e. an illustration of the individual sub-graph in the graph representation).
- Time stamp for each process step in the laboratory information management system (LIMS).
- Personnel tie-up time and qualification required for each process step.
- Total working hours available per qualification and associated working hour corridors.
- Capacity and utilisation of the instruments, auxiliary tools and areas used.
- (Further information that supports active production steering)
Helpful key performance indicators (KPIs) are processing times per laboratory area, processing time as a percentage of the technically possible minimum processing time (always ≥100%), capacity utilisation indices for machines and available laboratory staff, degree of fulfilment in the sense of the output target, etc. Using this information, the time required for completion under ideal conditions can be calculated at the latest when the sample arrives. Residence times, processing times including transport times in the laboratory as well as the usage intensities of the resources required in the various laboratory areas are known. The overall process can be visualised as a graph in the sense described above; a logical and machine-usable map of the ideal process that forms the core of the subsequent AI control system is therefore available.
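A minimal sketch of how some of these KPIs could be computed from LIMS-style timestamps is shown below; the record fields and KPI names are assumptions for illustration, not a prescription of how a real LIMS stores this information.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SampleRecord:
    """Timestamps and planning data for one sample, e.g. taken from the LIMS."""
    received: float          # hours since the start of the observation period
    completed: float
    minimum_time: float      # technically possible minimum processing time in hours
    agreed_deadline: float

def kpis(samples: List[SampleRecord], staff_hours_available: float, staff_hours_used: float):
    """Compute a few of the KPIs named above from LIMS-style records (sketch)."""
    throughput = [s.completed - s.received for s in samples]
    return {
        "mean_processing_time_h": sum(throughput) / len(samples),
        # Processing time relative to the technically possible minimum (always >= 100 %).
        "pct_of_minimum": 100.0 * sum(throughput) / sum(s.minimum_time for s in samples),
        "staff_utilisation_pct": 100.0 * staff_hours_used / staff_hours_available,
        # Degree of fulfilment: share of samples completed by the agreed deadline.
        "fulfilment_pct": 100.0 * sum(s.completed <= s.agreed_deadline for s in samples)
                          / len(samples),
    }
```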
In content terms, the implementation of an AI-controlled, dynamic worklist suggestion tool comprises the following six steps, whereby the technical implementation can and should be linked to the LIMS that is used so that all functionalities can be integrated unobtrusively into a familiar user interface. The first five steps should be carried out strictly in a test environment. Transfer or connection to the respective productive system only takes place in the final step.
- Production attributes (e.g. travel times, processing times, waiting times, reserved capacities, set-up times, automation options, etc.) and production restrictions (e.g. available working hours per qualification) and their possible combinations are systematically varied in order to determine the influence of attribute/restriction combinations on the KPIs. This is carried out on the basis of the graph representation, i.e. on the basis of the logical map of the process and hypothetical test scopes that specifically activate one or more process paths within the graph representation. Note: the attributes and restrictions that have an influence on the KPIs should be found in this step. Depending on the issue, these attributes can also be indirect and concealed or can consist of combinations.
- With the complete set of variable attributes and restrictions, a sufficiently large pool of random samples is now simulated against all KPIs in the graph representation. Ideally, this involves a pool of real orders for which the actual KPIs and elements of the graph representation are known or can be reconstructed. This retrospective view immediately enables the variations carried out in the attributes and restrictions to be assessed: ‘how long would the throughput time and personnel tie-up times have been if …’. In particular, benchmarking is carried out for each of these virtual order processing operations under hypothetical new conditions with regard to delivery reliability, i.e. adherence to the processing time promised to the customer. The result is a suggested sequence in which the samples in the fixed pool should have been processed.
- The ethics corridor is now defined. The training phase or the machine learning phase begins. Which variations were good, which were bad? After each cycle (one variation, all samples in the pool), the system is informed whether or not the defined set of KPIs has improved. Attention 1: statements are always made on the complete set of KPIs, not on individual, selected KPIs. Whilst the assessment can be graded (e.g. according to a scoring system), all KPIs must be assessed and the set of KPIs should not be changed. The system requires these degrees of freedom and this flexibility later on. Attention 2: the AI should emulate human decision-making capability. To do this, the KPIs must be assessed as good or bad. Indifferent decision-making behaviour leads to instability. If the human is uncertain with regard to the assessment, the AI will always require further training before it delivers stable results. The result of the ML phase is that the system will favour certain variation systems but assign less weight to others.
- The human now shifts into the background as an assessment authority. The system now iteratively carries out (numerous) variations of the attributes and restrictions, observing the trained good/bad criteria. The pool of samples and the retrospective analysis of the result remain unaffected. The human monitors the behaviour of the variations and their effect on the KPIs. The system has been trained adequately if convergent solutions are delivered, i.e. on average, continuous improvements in the KPIs up to an optimum. Divergent behaviour would be, for example, a deterioration in the KPIs or unrealistic solutions, e.g. the abandonment of the qualification mix for processing (all work should only be carried out by one qualification with simultaneous underloading of all other qualifications). The system should stabilise itself; otherwise, the training is not yet adequate or the selected random sample of orders is not sufficiently representative.
- The dress rehearsal now takes place. We remain in the retrospective approach, but bid farewell to the fixed sample pool. We now permit dynamisation according to the receipt of samples in a past period to be defined (e.g. the entire past calendar year). The system should now deliver a stable solution at all points in time. Ideally, it would have delivered better KPIs at all points in time than those actually achieved. The result is now an AI-generated suggestion list for the sequence of order processing. If this dress rehearsal works well in retrospect over a longer period of time (e.g. one year), the AI can enter production.
- Transfer of the AI to the productive system. Use of the generated work lists as an essential control tool for ‘production’. Automated feedback of the KPIs actually achieved to the AI for automated refinement and adaptation to the volatility of daily business. Attention: the production-near use of AI methods must always be embedded in a quality management process of continuous improvement.
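The assessment step of the ML phase described in the list above (one verdict per cycle over the complete KPI set, with indifferent answers not allowed) could be sketched as follows; the function signature and the +1/-1 convention are assumptions.

```python
def assess_variation(kpis_before, kpis_after, better_if_lower, human_verdict):
    """Score one variation cycle against the *complete* set of KPIs (sketch).

    kpis_before, kpis_after -- dicts: KPI name -> value before/after the variation
    better_if_lower         -- dict: KPI name -> True if a lower value is an improvement
    human_verdict           -- callable(per-KPI changes) -> +1 ('good') or -1 ('bad');
                               indifferent answers are not allowed, as they lead
                               to instability during training
    """
    changes = {}
    for name, before in kpis_before.items():
        delta = kpis_after[name] - before
        changes[name] = -delta if better_if_lower[name] else delta   # > 0 means better
    verdict = human_verdict(changes)
    assert verdict in (+1, -1), "assessment must be unambiguous: good or bad"
    return verdict, changes
```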
An additional benefit of this approach is that bottlenecks and drivers of resource waste are made visible. Capacity reserves can be derived from the results obtained in the fifth step (the dress rehearsal). From these, a company can in turn derive a suggestion system for process improvements (e.g. shortening travel times, optimising the use of qualifications, exploiting available time corridors, implementing automation, etc.). If such changes are implemented, the graph representation must also be adapted. Whether a new training phase is necessary for the AI or whether the changes to the graph can be transferred directly to the productive system must then be checked in the test environment. The flexibility of the laboratory in crisis situations can also be simulated, e.g. a dramatic surplus of priority A (‘immediate’) samples or temporary capacity limitations due to illness, conversions or validation work for new equipment, etc.
Reduction of food waste in the food retail sector (outline)
According to a study conducted by the Thünen Institute in 20196, around 12 million tonnes of food are destroyed in Germany every year. More than half of this waste occurs at the end-consumer stage. Only 4% of the total volume is attributable to the food retail sector itself. Taking the associated value chains from primary production and processing to retail into consideration, however, the potential for reducing food waste is significant, as shown in Table 1.
The method described so far will now be applied, at least in part and with adapted terminology, to example 4, the reduction of food waste in the food retail sector. In contrast to examples 1) to 3), however, the process to be optimised by means of AI is not located entirely within one’s own company. While shifting the optimisation upstream towards the supplier industry would appear to make sense, the options for calling off small quantities arbitrarily and at short notice or implementing just-in-time and customised lead time programmes are limited. The related costs often increase disproportionately and sales forecasts are uncertain; while waste no longer occurs at the point of sale, it may possibly do so in upstream stages of the supply chain. Efforts must essentially be made to achieve a holistic solution that encompasses the entire supply chain.
Nevertheless, AI approaches can lead to significant improvements in terms of reducing surplus quantities insofar as upstream suppliers are also included and the basic function of the retail sector (bridging time, quantities and distances) is maintained for the end consumer at the same time. A complete graph representation (see Figure 7) for this issue would be prohibitively complex due to the multitude of necessary information providers. Added to this is the fact that success is extensively dependent on forecast data, particularly on consumer behaviour (unit sales forecasts) and also on anticipated harvest volumes for agricultural products. In addition, unit sales forecasts for short periods of time must also be distributed laterally, i.e. established on a regional basis, as goods cannot simply be redistributed logistically (due to cost and also sustainability reasons) in the event of deviations from regional forecasts.
For a holistic solution, regionally specific circumstances, macroeconomic correlations and technical options as well as their costs and energy balances would therefore have to be adequately taken into consideration, while the information situation is at the same time partially non-transparent. All this cannot be compensated for by computing power, as the fundamentals for establishing robust models and simulations are not consistently ensured.
The approach now outlined for a partial solution makes sense and is also applied in variations in pilot projects. The overall task is broken down into individual strands in the graph representation until the sub-task is completely described and assigned with information. To do this, a goods group that fulfils the following characteristics is selected:
- BBD mix in the product range. Limited shelf life for at least some of the articles in high demand (attention: no use-by date).
- Continuous temporal and regional availability.
- Consumer-side, national basic demand for products in the goods group. Available nationally at points of sale.
- Mechanisms for fluctuations in demand are known and easy to anticipate.
- Articles in the goods group are not processed extensively and can be traced back to a basic product from primary production wherever possible.
The goods groups that are suitable from this perspective are milk and dairy products, with a wide range of BBD intervals on the one hand (from fresh milk and fermented products up to and including condensed milk and spray-dried milk powders) and seasonally increased demand for a selection of the product range on the other hand (e.g. drinking yoghurt and buttermilk products, but also ice cream varieties with milk or cream content7). Milk and dairy products are available nationally in every full-range supermarket. The related value chains for production within the EU or in Germany are adequately transparent.
7 Although dairy ice cream, for instance, has to contain at least 70% milk, ice cream is often listed as a separate goods group or is classed as ‘confectionery’, for example. Inclusion in the AI process would be optional but advisable in content terms.
The objective of using AI can now pursue two strategies (possibly also in combination): improved planning of production volumes in combination with the timely return of surpluses before the expiry date for further processing into other products, or the selling off of surplus goods. Essentially, there are two possible mechanisms for the latter: a) dynamisation of the best-before date and thus the extension of the temporal range of saleable residual stocks at the same price, or b) a reduction in price as the fixed best-before date approaches for the purpose of creating an additional incentive for consumers to buy. Active initiatives work exclusively with option b), as the dynamisation of the best-before date is hardly feasible in practice today.
For products from the goods group that cannot be returned for further processing due to a short maximum shelf life, the issue of ‘additional purchase incentive through price reduction’ can be isolated from an AI perspective. To do this, the remaining shelf life and the residual quantity with a given best-before date as of which the price reduction should come into effect must be determined for each article. Forecasts are required for this. Amongst other factors, the following must be taken into consideration at regional level:
- Retrospective sales figures and their fluctuations in the dimensions of quantity and value, but also the number of purchase processes as an indicator of the number of customers.
- Retrospective information on volumes disposed of per article (reference value for controlling KPIs).
- Remaining stocks in the warehouse and on the store shelf (BBD-related).
- Other information from inventory management (e.g. volumes and arrival dates of resupplied fresh goods).
For the short-term range of goods with a short BBD, sell-offs must also be forecast in the short term and regionally, and the dependency of the volumes sold off on a price reduction must be modelled. The following can be taken into consideration for this, for instance:
- Level of the price reduction.
- Weekends, public holidays and bridging days, school holidays during the forecast period.
- Weather conditions anticipated during the forecast period.
- Mass events planned during the forecast period.
The AI would now vary the time periods for price reductions and the level of the price reduction and would have to be trained empirically. A sell-off rate of goods shortly before the BBD that lies statistically outside of the previous fluctuation ranges of the sell-off figures would be ‘good’, with the result that statistically significantly fewer goods have to be disposed of. Any trend that indicates the occurrence of cannibalisation or substitution effects would be ‘bad’. That would mean that consumers wait for the price reduction and reduce the quantity of goods purchased at the regular price. Or that consumers switch to other products.
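A hedged sketch of the ‘good’/‘bad’ assessment for one price-reduction action might look as follows; the simple z-style comparison against past fluctuation ranges, the ‘neutral’ outcome and all parameter names are assumptions made for illustration.

```python
import statistics

def assess_price_reduction(selloff_history, selloff_with_discount,
                           regular_sales_history, regular_sales_now, z=2.0):
    """'Good'/'bad' assessment of one price-reduction action (sketch).

    selloff_history       -- past sell-off quantities shortly before the BBD,
                             without a price reduction, one value per period
    selloff_with_discount -- sell-off quantity achieved with the price reduction
    regular_sales_history -- past unit sales at the regular price
    regular_sales_now     -- unit sales at the regular price during the action
    z                     -- how many standard deviations count as 'significant'
    """
    mu_s, sd_s = statistics.mean(selloff_history), statistics.stdev(selloff_history)
    mu_r, sd_r = statistics.mean(regular_sales_history), statistics.stdev(regular_sales_history)

    significantly_more_sold_off = selloff_with_discount > mu_s + z * sd_s
    cannibalisation = regular_sales_now < mu_r - z * sd_r   # regular-price sales collapse

    if cannibalisation:
        return "bad"      # consumers wait for the discount or switch products
    if significantly_more_sold_off:
        return "good"     # statistically significantly fewer goods need to be disposed of
    return "neutral"      # effect within the usual fluctuation range
```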
This issue could also be formulated as an optimisation task for milk and dairy products in isolation, i.e. without the use of AI. However, the benefit of using AI arises when correlations are extrapolated to other goods groups and information gaps in the value chains can be bridged by means of AI.
As soon as primary production, processing companies and other upstream suppliers are included in the waste avoidance model, it is even easier to use AI than to transfer the overall issue to an optimisation task. For reasons of space, the possible application areas of AI can only be outlined here.
AI application areas in the area of primary production:
Avoidance of overproduction
- Flexibilisation of production volumes – dynamic substitution adapted to demand.
- Long-term demand forecasts (basic demand, trends, positive ‘issues’, statutory regulations).
- Medium-term demand forecasts (weather, events, competitive situation, campaigns).
- Breakdown of production factors into extrinsic (weather, geogenic factors, pests, statutory regulations) and intrinsic contributions (region, technology, qualification, scope for alternative products).
AI application areas in the area of processing:
Avoidance of overproduction
- Flexibilisation of the product range – distribution of primary raw materials to goods with various BBD classes.
- Dynamic forecasts for demand and unit sales volumes per supplied region.
- Feedback of unit sales forecasts to production volume control for each BBD class.
- Feedback of unit sales forecasts to stock quantity planning for each BBD class.
- Optimisation of the utilisation of returned goods.
AI application areas in the retail sector (without sales price adjustment):
Improvement of the reliability of unit sales forecasts
- Volume forecasts per article in the regular repurchasing rhythm.
- Dynamic regional volume forecasts per article, taking public holidays, vacation periods, events and weather into consideration.
- Reverse feedback of the information into the value chain.
Pilot projects, some of them publicly funded, are currently being conducted on the use of AI to reduce food waste in the value chain.
Outlook: application of AI in risk prevention for food
The use of ML and AI is also conceivable in the area of risk prevention and early warning of potential risks across several stages of a value chain. The first consistent models have been in existence since the beginning of 2022, but have not yet been published. The procedure will be explained here and the use of AI encouraged based on the example of a mass product that is in high demand.
The example product is a fictitious monolithic chocolate with pieces of biscuit, ‘milk chocolate with pieces of biscuit’. In the graph representation that has now been used multiple times, a vastly simplified international value chain could appear like the one shown in Figure 9.
Via numerous process steps (arrows), intermediate stages and refinements (circles), the primary ingredients (left) are ultimately transformed into the end product (right) that makes its way to retailers. The transparency of the value chain for the manufacturer of the end product decreases rapidly towards the primary ingredients, and the available information becomes increasingly general and less specific. Incidents that occur early on in the value chain often have to take effect over several stages before they are recognised and can be contained. Occasionally, these effects also reach the end product or the preceding stage. In such a case, causal analysis and damage clarification have to be undertaken at significant expense. Non-conforming goods have to be destroyed under certain circumstances.
The questions for an AI approach are now:
a) Can the value chain be modelled appropriately with respect to risk prevention?
b) Is information available that could serve as supporting or reference points for the modelling?
c) Can risk effects that have been clarified in the past be used to prevent the undetected transfer of risks across multiple stages of the value chain?
To answer these questions, meanings and information have to be assigned to the elements in the graph representation in an appropriate manner.
We start with the circles (see Figure 10). These designate states in which the product finds itself at this stage of the value chain. Appropriate terms would be: fermented, peeled, mixed, de-oiled, bleached, packaged, stored, etc. A risk profile is assigned to each state. This is usually a vector whose components consist of analytically accessible, measurable variables. The value-adding and value-reducing characteristics can be represented as analytical parameters: e.g. protein content, fat content, theobromine content, cadmium content, mould contamination, ochratoxin-A contamination, etc.
Each arrow designates a form of processing that transfers the respective product from one state to the next state (see Figure 11). Suitable terms would be: harvesting, peeling, storing, grinding, mixing, heating, transporting, colouring, etc. Mathematically, the arrow would be an operator that transforms one vector into the following vector. Sterilisation heating, for example, would be able to kill a microbe (operator ‘zero’), but would not change the content of mould toxins (operator ‘one’).
The n-fold concatenation of these basic status-processing elements with the associated influencing ‘inheritance’ of the value-adding and value-reducing characteristics forms the value chain (see Figure 12).
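In a minimal NumPy sketch, a state could be a vector of analytical parameters and each processing step a matrix operator acting on it; the choice of components and the numerical values are illustrative assumptions, while the sterilisation example (microbial count to zero, mycotoxin content unchanged) follows the text.

```python
import numpy as np

# State vector of analytically accessible parameters at one stage of the chain.
# The choice and order of components are assumptions for this sketch:
# [protein %, fat %, cadmium mg/kg, ochratoxin A µg/kg, microbial count CFU/g]
state_fermented = np.array([12.0, 8.5, 0.08, 1.2, 1.0e5])

# Each processing step (arrow) is an operator that maps one state vector to the
# next. Sterilisation heating takes the microbial count to zero (operator 'zero')
# but passes the mycotoxin content through unchanged (operator 'one'); modelled
# here as a diagonal matrix.
sterilisation = np.diag([1.0, 1.0, 1.0, 1.0, 0.0])

# A concentration/drying step might scale the contents instead (illustrative values).
concentration = np.diag([1.8, 1.8, 1.8, 1.8, 1.0])

# The value chain is the n-fold concatenation of such operators.
state_end = concentration @ sterilisation @ state_fermented
print(state_end)   # ochratoxin A is 'inherited' and concentrated; the microbes are gone
```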
Two challenges remain:
Measurement data is by no means available for all states and is often not even continuously available along the value chain as yet. States for which no measurement data is available could be approximately calculated in the context of modelling and made a partial aspect of the use of AI. Quite apart from the issue of data quality, however, the consistent use of blockchains can equate to a significant technological leap forwards and can enable and stimulate the use of AI in prevention.
The second challenge is the precise mathematical description of the transformation operators (arrows). Formulated as a question: how do my laboratory analytical parameters change as a result of a processing step, transport or storage? Answering this question is by no means trivial and requires an in-depth understanding of food technology. However, preliminary studies show that only approximately 300 of around 2,000 possible operators8 are used regularly in practice, which significantly reduces complexity in this area.
The AI will now focus on modelling the entire value chain – insofar as it is known or can be reconstructed – and will simulate the effects of disturbances during the early value creation stages on the end product and visualise indicators in the form of expected changes in the measurement results, on the basis of which a risk can be identified. The available real measurement data, which must be reliably reproduced during each modelling process and which serves as supporting points for the algorithm, is regarded as a stability criterion.
8 Knowledge status June 2022.
The ethics corridor is established by reconstructing known and clarified defects and effects of clearly described circumstances that have led to a risk. A successful reconstruction is ‘good’. Unrealistic or ‘panicky9’ modelling results are ‘bad’.
Once the results have stabilised reliably, the criteria and methods that have been found can be applied to other products. In this case, the AI will search for similar risk propagation patterns; based on the known state vectors and processing operators, this is initially sheer hard work, an effort that is eliminated by the subsequent training of the AI. Test modelling showed, for instance, that the case of ‘noroviruses in frozen strawberries’ seen in the past followed an almost identical mechanism to the case of ‘hepatitis viruses in dates’ that occurred a few years later. In this retrospective view, the second case could have been anticipated and countermeasures implemented very early on.
The mathematical mapping is very clear and works with established techniques that have been known for a long while; in this respect, the problem is statically clear. However, the algorithmic implementation of such a prevention analysis based on AI techniques is complex, because the necessary information matrix is very sparsely populated with supporting points and, on closer analysis, the value chains reveal a vast number of states and processing operations. Humans are no longer able to understand or trace the paths and intermediate results of the AI modelling; they focus their intellectual capacities on stability and convergence and on the definition of the ethics corridor. In practice, AI systems of this order of magnitude are dynamically no longer transparent.
Assuming that the example chocolate is produced in Germany, the sugar alone can come from approximately 140 countries, some of which have a highly fragmented producer structure10. If such a detailed breakdown is undertaken for all of the specified basic ingredients, approximately 90,000 states and processing operations or transitions are obtained. A realistic graph representation for milk chocolate with pieces of biscuit would then appear as shown in Figure 13.
Numerous flanking programming techniques have to be implemented here in order to keep computing time and memory requirements within certain limits and therefore, as a secondary factor, also the downstream energy requirements of such modelling. Incidentally, when used in risk prevention, the AI shares this aspiration to use resources efficiently with the technical implementation of a blockchain.
9 As the assignment of suitable information to the individual elements of the value chain is very weak, i.e. a great deal of information is not available and has to be interpolated, divergences always occur at the beginning of the training phase simply because of the stability behaviour of the numerical methods. Either no risk, however disastrous it may be, reaches the end product or – much more frequently – every disturbance, no matter how small, blows up into a maximum risk for the end product due to ‘panic’. In the sense of the ethics corridor, these solutions are ‘bad’.
10 Source: Dr Jörg Klinkmann (Storck), Food Safety Congress 2017 in Berlin.
Summary
With the vast improvement in the performance of affordable computers over the last two decades, artificial intelligence is undergoing a renaissance. The intuitively understandable example of photo restoration was used to illustrate the elementary steps for systematically implementing AI methods and to demonstrate their delimitation from iterative optimisation methods. This is intended to provide readers with a tool for also assessing issues in their own company from an AI perspective. Following the same basic pattern, the now ‘demystified’ AI can be successfully used to solve very different issues within the food industry.
The area of predictive risk prevention based on AI-supported modelling offers great potential not only to avert risks, but also to serve as a tool for preventing the further processing of defective goods, which is more desirable from a sustainability perspective than sorting good goods from defective ones. The procurement of credible and comprehensive detailed information is increasingly shifting into focus, and blockchain technology can become established as an excellent vehicle for creating this data transparency. The technical implementation of such issues, which may be statically clear but turn out to be dynamically complex in operational practice, places high demands on the skills of the developers and requires routine in applying techniques from the big data environment. The potential for safety gains and improved sustainability in the food industry is enormous. Although many of the existing approaches are still in the research and development stage, the road to successful application and to gaining users’ trust in this new old technology has already been prepared, and broad-based implementation is likely in the near future.
Contact
Simone Schiller - Managing Director of DLG Competence Center Food (DLG-Fachzentrum Lebensmittel) - Tel: +49 69 24788-390 S.Schiller@DLG.org