When artificial intelligence goes fishing: On the importance of complete and representative training data of AI-driven competition law enforcement

Jerome De Cooman
University of Liege

Artificial Intelligence (hereafter, ‘AI’) systems are widely adopted by public administrations, and competition law enforcement is no exception. This is unsurprising. First, AI systems promise to address well-documented flaws in human decision-making, e.g., arbitrariness or bias. Second, AI systems carry the potential to solve the scissor effect of competition law enforcement identified by the European Court of Auditors (hereafter, ‘ECA’) in 2020, i.e., a decrease in market surveillance capacity on the one hand, and an increase in the complexity of cases on the other [1]. The ECA therefore called on the European Commission (hereafter, ‘EC’) to put more effort into proactively detecting anticompetitive behaviour. In this regard, it has been suggested that AI systems could revitalize ex officio investigations by helping the EC open the ‘right’ investigation [2]. One way to do so is AI-driven cartel screening [3].

AI-driven cartel screening flags abnormal patterns that then trigger the need for further investigation [4]. This paper focuses on dawn raids. In this regard, it should be borne in mind that the duty to state reasons applies to dawn raids, at least to some extent. Leaving aside the duty to state specific reasons pursuant to Article 20(4) of Regulation 1/2003 [5], this paper focuses on the conditions set out by the European Court of Justice (hereafter, ‘ECJ’): for a dawn raid to be legal, the EC has to be in possession of information and evidence providing reasonable grounds for suspecting an infringement of competition law by the undertaking concerned [6], and the statement of reasons must not be excessively vague, succinct and generic [7]. The question is, therefore, whether the output of AI-driven cartel screening constitutes such information and evidence providing reasonable grounds.

This is debatable. This algorithmic shift in the fight against cartels faces (at least) one major challenge. AI-driven cartel screening is a data-dependent solution and is, therefore, affected by problems in the availability and quality of the data it relies on [8]. As a result, the rate of error (both type I, i.e., false positives, and type II, i.e., false negatives) is expected to be non-negligible [9]. In the context of copyright infringement [10], the ECJ has held that a filtering system with an inadequate rate of false positives would be contrary to fundamental rights [11]. Identifying the adequate level is, obviously, the stumbling block of the discussion. Advocate General Henrik Saugmandsgaard Øe suggested in his opinion delivered in Poland v Parliament and Council that ‘the error rate should be as low as possible’ [12]. Therefore, whenever it is not possible ‘in the current state of technology (…) to use an automatic filtering tool without resulting in a ‘false positive’ rate that is significant, the use of such a tool should (…) be precluded’ [12]. This paper argues that similar reasoning holds in the context of competition law and dawn raids.
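To make the notions of type I and type II error concrete, the following is a minimal illustrative sketch in Python. It is not taken from any screening tool discussed in this paper: the function name and all figures are hypothetical assumptions chosen only to show how the rates are computed from the outcomes of a screen.

```python
def screening_error_rates(tp, fp, tn, fn):
    """Return (false positive rate, false negative rate, precision).

    tp: cartelised markets correctly flagged
    fp: competitive markets wrongly flagged (type I errors)
    tn: competitive markets correctly left unflagged
    fn: cartelised markets missed by the screen (type II errors)
    """
    false_positive_rate = fp / (fp + tn)  # type I error rate
    false_negative_rate = fn / (fn + tp)  # type II error rate
    precision = tp / (tp + fp)            # share of flags that are real cartels
    return false_positive_rate, false_negative_rate, precision


# Hypothetical figures (assumption): 1,000 markets screened, 20 of them cartelised.
fpr, fnr, precision = screening_error_rates(tp=16, fp=49, tn=931, fn=4)
print(f"type I rate: {fpr:.1%}, type II rate: {fnr:.1%}, precision: {precision:.1%}")
```

Under these assumed figures, the false positive rate is 5%, yet only about one flagged market in four actually involves a cartel, because cartels are rare among the markets screened. This is one reason why an apparently low error rate may still be significant when a flag is relied on as grounds for a dawn raid.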

To address this twofold data issue (availability and quality), this paper draws inspiration from the Proposal for a Regulation laying down harmonised rules on AI (hereafter, ‘AI Act’) and suggests technical solutions, based on semi-supervised learning, to ensure that AI-driven screening provides a legal justification for a dawn raid rather than becoming a source of fishing expeditions.


[1] European Court of Auditors, ‘The Commission’s EU merger control and antitrust proceedings: a need to scale up market oversight’ (November 2020) Special Report n°24.

[2] Andreas von Bonin and Sharon Malhi, ‘The Use of Artificial Intelligence in the Future of Competition Law Enforcement’ (2020) 11 Journal of European Competition Law & Practice 468.

[3] Nathalie de Marcellis-Warin, Frédéric Marty and Thierry Warin, ‘Vers un virage algorithmique de la lutte anticartels? Explicabilité et redevabilité à l’aube des algorithmes de surveillance’ (2021) 23 Revue internationale d’éthique sociétale et gouvernementale 1.

[4] Joseph E. Harrington, Jr. and David Imhof, ‘Cartel Screening and Machine Learning’ (2022) 2 Stanford Computational Antitrust 133.

[5] Council Regulation (EC) No 1/2003 of 16 December 2002 on the implementation of the rules on competition laid down in Articles 81 and 82 of the Treaty, OJ L 1, 4 January 2003, 1-25.

[6] Case C-94/00, Roquette Frères, EU:C:2002:603, paras 44-50.

[7] Case C-247/14 P HeidelbergCement AG v European Commission, EU:C:2016:149, para 39. This case concerned a request for information under Article 18(3) of Regulation 1/2003, but it has been argued that its conclusion may be applied mutatis mutandis to dawn raids. On this, see Helene Andersson, Dawn Raids Under Challenge: Due Process Aspects on the European Commission’s Dawn Raid Practices (Hart Publishing 2018) 86.

[8] Albert Sanchez-Graells, ‘Data-Driven and Digital Procurement Governance: Revisiting Two Well-Known Elephant Tales’ (2019) 24 Communications Law 157.

[9] For a similar argument made in the context of (non-algorithmic) cartel screening, see Frederic M. Scherer, Industrial Market Structure and Economic Performance (Rand McNally & Company 1970) 131.

[10] More concretely, it was in the context of the introduction of a system for filtering information stored on an online social networking platform in order to prevent the publication online of files which infringe copyright.

[11] Case C-360/10 Belgische Vereniging van Auteurs, Componisten en Uitgevers CVBA (SABAM) v Netlog NV, EU:C:2012:85, para 50 (emphasis added).

[12] Opinion of Advocate General Saugmandsgaard Øe delivered on 15 July 2021 in Case C-401/19 Republic of Poland v European Parliament and Council of the European Union, EU:C:2021:613, para 214.