The Future of Wikipedia in the Age of AI
As the use of AI models increases, the way users seek information is evolving. Queries are becoming more complex and conversational, and results are typically based on a much larger body of data, rather than a specific source or page.
As these models become increasingly integrated into our daily lives, the importance of Wikipedia in shaping brand reputation cannot be overstated, since it is a major source for training AIs.
Importance of Wikipedia in AI Training
According to The New York Times, “Wikipedia is probably the most important single source in the training of AI models.” The platform’s vast trove of crowdsourced knowledge, covering a wide range of topics, provides invaluable data for AI models to learn from. Without access to this information, the development of current generative AI capabilities might not have even been possible. (Here’s some additional information on how AIs/LLMs/Chatbots are trained.)
Impact on Brand Reputation
With AI models like ChatGPT, Claude, and Gemini having been trained on Wikipedia, inaccurate or biased information on the site can resurface as negative or incorrect statements about a brand in AI-generated answers, potentially harming its reputation. With so much riding on the underlying information in Wikipedia, ensuring that a brand’s Wikipedia presence is accurate and positive has become more important than ever.
Recommendations
Given Wikipedia’s elevated status, our recommendations for companies, brands, and individuals are to work within the Wikipedia guidelines to do the following:
- Maintain: Create and/or maintain a well-structured, robust Wikipedia page for your brand or personal profile.
- Update Accurately: Make sure the page remains updated and accurate with current facts, figures, and noteworthy achievements.
- Include more sources: Since LLMs utilize all of the content, include as many relevant, verifiable sources as appropriate; these should only help the AI training.
- Go Multilingual: Consider developing a presence across multiple language editions of Wikipedia. LLMs often learn from content in various languages, and the more you play an active role, the better. Also, consider that English is often the hardest language version of Wikipedia to impact, and other language versions can be very easy to edit.
- Other Wiki pages: LLMs can learn about your brand and industry from any Wikipedia article, so consider getting pertinent information added to related industry articles, not just the ones about your brand.
- Talk Pages: Leverage Wikipedia’s “Talk” pages to include additional relevant information, as LLMs may also use these for training.
- Images: Consider submitting relevant images via Wikimedia Commons to enhance your Wikipedia page and improve AI model understanding.
- Categorize: Utilize Wikipedia’s category system to ensure your page is properly categorized and connected to the ideal topics.
- Monitor: Monitor your Wikipedia presence for edits that may introduce inaccuracies, outdated information, or bias; address issues appropriately and promptly. Do the same for other relevant pages related to your company or brand. Our free WikiAlerts service provides tracking of Wikipedia and Talk pages.
- Wikidata: Beyond Wikipedia, leverage Wikidata, Wikipedia’s sister project, a powerful database of community-contributed structured data that LLMs may increasingly use to verify facts.
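For teams that want to see what the Monitor recommendation could look like in practice, here is a minimal sketch in Python. It uses the public MediaWiki Action API (`action=query` with `prop=revisions`) to list the latest edits to a page and its Talk page. The function names, the `User-Agent` string, and the "Example Brand" title are hypothetical placeholders chosen for illustration; this is not how WikiAlerts works, just one possible starting point.

```python
import json
import urllib.parse
import urllib.request

# Per-language API endpoint, e.g. en.wikipedia.org or de.wikipedia.org.
API_TEMPLATE = "https://{lang}.wikipedia.org/w/api.php"


def build_revisions_url(title, lang="en", limit=5):
    """Build a MediaWiki API URL that lists the latest revisions of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "timestamp|user|comment",
        "rvlimit": str(limit),
        "format": "json",
        "formatversion": "2",  # pages come back as a list, not a dict
    }
    return API_TEMPLATE.format(lang=lang) + "?" + urllib.parse.urlencode(params)


def parse_revisions(payload):
    """Flatten an API response into (timestamp, user, comment) tuples."""
    rows = []
    for page in payload.get("query", {}).get("pages", []):
        # Missing pages have no "revisions" key, so they yield no rows.
        for rev in page.get("revisions", []):
            rows.append(
                (rev.get("timestamp", ""), rev.get("user", ""), rev.get("comment", ""))
            )
    return rows


def fetch_recent_revisions(title, lang="en", limit=5):
    """Network call: recent edits for an article and its Talk page."""
    results = {}
    for page_title in (title, "Talk:" + title):
        request = urllib.request.Request(
            build_revisions_url(page_title, lang, limit),
            # Placeholder User-Agent; replace with your own contact details.
            headers={"User-Agent": "brand-monitor-sketch/0.1 (you@example.com)"},
        )
        with urllib.request.urlopen(request, timeout=10) as response:
            results[page_title] = parse_revisions(json.load(response))
    return results
```

Calling `fetch_recent_revisions("Example Brand", lang="de")` would check the German-language edition, which ties in with the Go Multilingual recommendation. Running a check like this on a schedule and comparing timestamps against the last run is one simple way to spot new edits that need review.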
Do We Even Need Wikipedia in a World of AI?
An interesting question that has been raised recently is whether there is even a need for Wikipedia. Since the content is taken from various third-party sources, and the LLMs presumably have access to those sources and probably many more, why can’t an AI produce Wikipedia content that would be as good as or better than content created by Wikipedia editors?
To answer this question, there have been various attempts to use AI to write sections of Wikipedia pages. So far, despite the great capabilities of AI, these attempts have not been shown to produce content that is up to par. This may change in the future, but for now there still seems to be tremendous benefit derived from the human (crowdsourced) process that helps create a Wikipedia page. Perhaps AIs that are trained on this process will eventually produce content that is recognized to be of high enough quality.
Conclusion
Ongoing tracking of how AI models represent your brand, and what role Wikipedia may be playing, can help you identify areas for improvement within Wikipedia and beyond.
As the use of Wikipedia in AI training continues to grow, we believe that the future of brand reputation management will be even more closely tied to Wikipedia. By actively managing their Wikipedia presence, companies can ensure that AI models have access to an important trusted source of accurate and up-to-date information, ultimately leading to a more positive online reputation.
Five Blocks specializes in digital reputation management for platforms including Google and Wikipedia, combining cutting-edge technology and personalized service to help our clients overcome digital reputation challenges. Our advanced data analysis and AI-powered insights allow us to identify the root causes of digital reputation issues and uncover overlooked opportunities for improvement. We work closely with your communications team to develop and implement sustainable solutions that deliver long-lasting results.
For more information, or to see what we can do with your brand’s data, contact us.