Hippogram

Investor Panic Fuels Closed-Source Narrative, but Open-Source AI Foundation Models Will Define the Future

Bart de Witte

03 Jun 2024 • 5 min read

The **Atrium Libertatis** (Latin for "House of Freedom") in Rome, 3rd century BC

I received the link to the blog post “The future of foundation models is closed-source” from at least 15 people within my network, which shows how widely the author's biased post has spread. Its spread aligns with the bullshit asymmetry principle (Brandolini's Law), which states that the amount of energy needed to refute bullshit is an order of magnitude bigger than that needed to produce it.

Recent research provides a robust, evidence-based counterargument to the blog post's claims, thoroughly debunking the misinformation presented. So here I go again, debunking yet another widely circulated report that opposes one of the best developments we could have hoped for: open-source AI.

1. Conflict of interest

First of all, we need to put this blog post into context. The fact that it was written by an investor in a closed-source AI company, OpenAI, immediately raises concerns about potential conflicts of interest and biases in the arguments presented. Given this context, one could reasonably conclude that the blog post is a panicked response to the growing momentum and success of open-source AI models. Recent research papers suggest that open-source foundation models pose a credible competitive threat to closed-source alternatives.

Oversimplification of the open-source vs closed-source dichotomy:

Recent research shows that the distinction between "open" and "closed" models is not as clear-cut as the blog post suggests. Most models fall somewhere in the middle, with varying degrees of openness.
The blog post presents a binary view, claiming that open-source and closed-source models cannot both dominate in the long run. However, research indicates that a mix of open and closed models can coexist and contribute to a thriving innovation ecosystem.
Open Source AI is also evolving, and the official definition of what constitutes Open Source AI is still being discussed. The complexity is different from traditional open-source software, as it involves three layers: data, code, and the model itself. Opening the weights of a model under an Apache 2.0 license still contributes significantly to the open ecosystem.

Misunderstanding of the role and incentives of open-source model providers:

The blog post assumes that open-source model providers like Meta are only pursuing "open-source" as a marketing strategy or to commoditize complements. This argument is flawed. For Meta, commoditizing a complement means turning LLMs into tools that help control misinformation while avoiding the creation of new dependencies. Meta's products are heavily reliant on Apple's and Google's mobile platforms. If given the choice, Meta would prefer not to enter into another lock-in and dependency situation. IBM successfully deployed this strategy when it supported the Apache Software Foundation and contributed to the development of the Apache HTTP Server, an open-source web server. By doing so, IBM helped commoditize web server software, a complement to its own hardware and consulting services. This allowed IBM to benefit from the widespread adoption of a reliable, open-source web server while focusing on selling its high-margin hardware and services. This approach also helped counter the dominance of competitors like Microsoft in the web server software market, and this is exactly why the author of the blog is panicking: it's happening again, just as I predicted a few years ago.
However, open-source models can foster innovation commons and dynamic competition, which may be a key strategic consideration for these providers.
The blog post's claim that open-source model providers will eventually shift away from open-source due to cost or safety concerns is not supported by evidence. This wishful thinking by an investor in the closed-source AI company OpenAI is contradicted by research. A systematic analysis of AI foundation model licenses shows that open-source models score much higher on transparency benchmarks compared to their closed-source counterparts. Increased transparency and community involvement in open-source models can address important safety and security issues, refuting the blog post's assertions. Rather than shifting away from open-source, evidence suggests that open-source model providers are likely to maintain or even increase their commitment to openness to foster innovation and competition.

Flawed arguments about the relative merits of open-source vs closed-source models:

The blog post's claims about the cost, quality, and data security of open-source models are not substantiated by any research paper. On the contrary, research suggests that open-source models can provide significant benefits in these areas, especially for smaller developers and users.
The blog post's argument about the national security risks of open-source models is not well-grounded in research. In fact, research suggests that closed-source models may actually pose greater risks due to their lack of transparency and community oversight. An open AI ecosystem could foster multilateral collaboration on AI policy and security issues by creating a common and more transparent baseline from which all can assess the technology.
Contrary to the blog's claim that "China wants free model weights to train their own frontier models," China is unlikely to adopt open-source large language models (LLMs) from the U.S. due to its emphasis on control and censorship to ensure AI models align with national interests. The Chinese government prioritizes developing its own AI models to reduce dependence on foreign technology and maintain greater oversight. Given the intense geopolitical competition with the U.S., China focuses on fostering domestic AI capabilities. Additionally, China's regulatory environment, tailored to its political and social landscape, makes foreign models less compatible. Overall, China's approach to AI governance, which stresses national control and political alignment, makes the use of U.S. open-source LLMs less appealing.
The author's allergic reaction to Europe's socialist tendencies and to open-source AI shows how poorly informed he is. Open source has nothing to do with socialism or communism. Open source aligns with liberal democratic values such as transparency, collaboration, and the free exchange of ideas. Open-source AI serves as the digital bedrock for open societies, where information and resources are shared freely, and individuals have the freedom to contribute to and benefit from communal knowledge. Open source embodies these principles by enabling anyone to access, modify, and share, thus promoting a culture of openness and collective progress.
Given the clear conflict of interest, one could argue that the blog post's arguments are colored by the author's vested interest in promoting closed-source AI models, rather than being an objective assessment of the future trajectory of the industry. Current research papers, on the other hand, provide a more impartial and evidence-based perspective on the relative merits and competitive dynamics of open-source and closed-source generative AI.

In conclusion, the blog post's credibility is significantly undermined by the fact that it was written by an investor in a closed-source AI company, suggesting a biased attempt to downplay the growing success and potential of open-source AI models. It is reminiscent of the tobacco industry, which historically promoted cigarettes as healthy and ran advertising campaigns that portrayed smoking as a socially desirable and even healthful activity.

The fact is that open licenses reduce the risk of anticompetitive behavior and oligopoly. Given that the blog's author works for Peter Thiel's Founders Fund, I can't help but recall a line from Thiel's book "Zero to One": "Competition is for losers." Anyone opposing open-source foundation models is effectively advocating for the centralization of this newly unleashed power within society, and for bringing us back to the age of dislightenment. If you haven't read Jeremy Howard's contribution yet, I highly recommend doing so.

In summary, the blog post's arguments are not supported by the evidence and insights presented in recent research, which provides a more nuanced, comprehensive, and evidence-based perspective on the role of open-source and closed-source models in the generative AI ecosystem and directly contradicts the blog post's claims. Open-source models are poised to play a crucial role in fostering innovation and competition in the generative AI ecosystem.
