The Supreme Court of Colombia has dismissed a cassation appeal on the grounds that the filing was generated by artificial intelligence, a ruling that unraveled almost immediately when the court's own decision was subjected to the same scrutiny. The court used Winston AI to analyze the appellant's brief; the software scored the document as just 7% human-written, and the justices declared it inadmissible. However, a subsequent analysis by attorney Emmanuel Alessio Velasquez revealed that the court's own ruling, Auto AP760/2026, registered as 93% AI-generated when run through the identical software, exposing a critical methodological contradiction in the judicial system's reliance on automated detection tools.

The Hypocrisy of Algorithmic Adjudication

The dismissal of the appeal was based on the premise that the document lacked human authorship, a determination made solely by a third-party algorithm. Velasquez's counter-analysis demonstrates the fragility of using statistical probability as a definitive legal standard. The court's logic collapses under its own weight: if a ruling condemning AI use is itself scored 93% machine-generated, the tool that produced that score cannot be trusted to distinguish human from artificial writing. The situation highlights a profound hypocrisy in which the judiciary employs AI to police AI, despite the technology's demonstrated inability to reliably identify machine output.

Further testing by the publication using GPTZero yielded equally inconsistent results. When only the opening words of the court's ruling were scanned, the tool flagged the text as 100% AI-generated; when the scan was extended to include the full factual background, the verdict reversed to 100% human. This volatility suggests that AI detectors are highly sensitive to text length and context, rendering them unreliable for binary legal determinations. Other Colombian attorneys have replicated these findings, submitting documents written years before the advent of large language models that were nonetheless flagged as machine-generated.

False Positives and the Erosion of Trust

The Colombian case is not an isolated incident but part of a broader global failure of AI detection technology. These tools work by measuring statistical patterns such as sentence length, vocabulary predictability, and "burstiness" (the variation in sentence length and structure across a text). Such metrics are notoriously error-prone, particularly when applied to non-native English speakers or to writers with distinctive styles. A 2023 study published in the journal Patterns found that over 61% of TOEFL essays written by non-native English speakers were incorrectly flagged as AI-generated, and a systematic review by Weber-Wulff and colleagues concluded that no currently available tool is precise or reliable enough for high-stakes applications.
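The statistical signals described above can be made concrete with a toy sketch. The function below is purely illustrative (it is not how Winston AI, GPTZero, or Turnitin actually work; commercial detectors rely on model-based perplexity): it proxies "burstiness" with the standard deviation of sentence lengths and vocabulary predictability with a type-token ratio, showing why flat, uniform prose tends to score as machine-like.

```python
import re
import statistics

def burstiness_and_ttr(text):
    """Toy proxies for the signals AI detectors measure.

    Burstiness: population std-dev of sentence lengths (in words).
    Vocabulary predictability: type-token ratio (unique words / total words).
    Uniform sentence lengths and repetitive vocabulary both push a text
    toward a 'machine-generated' verdict under this kind of heuristic.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    burstiness = statistics.pstdev(lengths) if len(lengths) > 1 else 0.0
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", text)]
    ttr = len(set(words)) / len(words) if words else 0.0
    return burstiness, ttr

# Perfectly uniform sentences: zero burstiness, low lexical variety.
uniform = "The cat sat here. The dog sat here. The bird sat here."
# Varied, 'human-like' rhythm: short and long sentences mixed.
varied = "Wait. After the long hearing ended, nobody spoke at all. Then chaos."

b_uniform, _ = burstiness_and_ttr(uniform)
b_varied, _ = burstiness_and_ttr(varied)
# The varied sample shows strictly higher burstiness than the uniform one.
```

The fragility the article documents follows directly from such heuristics: any surface statistic a detector measures can be matched by careful human writing, or missed by paraphrased machine output, which is why length and context swings flip verdicts so easily.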

The financial and educational sectors have already begun retreating from these technologies because of the high cost of false positives. Turnitin, a leading academic-integrity provider, acknowledged in June 2023 that its detector produced significantly higher false-positive rates when AI content fell below 20% of a document. Citing those error rates, Vanderbilt University disabled the tool in 2023 after estimating it would generate roughly 3,000 false positives annually, and the University of Arizona removed AI-detection features after a student lost 20% of a grade to an erroneous flag. In a 2024 case at UC Davis, 17 linguistics students were flagged, 15 of whom were non-native English speakers, further underscoring the demographic bias inherent in these algorithms.

Even the creators of the technology have abandoned the effort: OpenAI withdrew its own AI text classifier in 2023, citing persistent inaccuracy. The source code behind these detection platforms remains closed, preventing independent verification of their logic. As legal experts note, the court's reliance on a black-box algorithm to decide whether a filing is admissible sets a dangerous precedent. The court's attempt to enforce human authorship with a tool that cannot distinguish human from machine writing ultimately undermines the very integrity it seeks to protect.

Markets, reacting to broader concerns about regulatory overreach and technological unreliability, showed weakness: the S&P 500 fell 0.9% to 6,817, the Dow Jones dropped 0.8% to 48,501, and the Nasdaq declined 1.0% to 22,517. While these movements reflect general sentiment rather than the ruling itself, the Colombian case serves as a stark warning to regulators and institutions worldwide: until AI detection tools achieve verifiable accuracy, their use in legal and administrative proceedings remains a gamble with potentially irreversible consequences.

Source: Decrypt | Analysis by Rumour Team