In a series of recent tweets, Yann LeCun, a prominent figure in the field of artificial intelligence, shared his experience and insight into the development of the DjVu image compression format and its profound impact on the machine learning (ML) and AI communities. LeCun started the DjVu project at AT&T Labs in the mid-1990s with the goal of creating an efficient way to distribute high-resolution scanned documents over the Internet. Released in the late 1990s and early 2000s, the DjVu format was adopted by platforms such as the Internet Archive.
LeCun’s plan to scan and distribute the entire Neural Information Processing Systems (NIPS) conference proceedings further demonstrated the usefulness of this format. With permission from publishers Morgan Kaufmann and MIT Press, who had not profited from past proceedings, LeCun and his team succeeded in making these resources widely accessible through a free website by 2000.
This effort was pivotal in shaping the culture of the ML/AI community toward open access and rapid sharing of preprint publications. Around the same time, community backlash against commercial journal publishers led to the creation of the Journal of Machine Learning Research (JMLR), a free, open-access journal that further supported this trend.
LeCun also talked about an interesting episode with Springer, the for-profit publisher that owned the rights to the first volume of NIPS. Permission for digital distribution was initially denied, but this decision was quickly reversed following a surge in email requests to Springer executives, highlighting the community’s collective influence.
LeCun acknowledged the important role of other contributors to the DjVu project, such as Léon Bottou and Patrick Haffner. The format’s legacy extends beyond academia, influencing projects such as Google’s Book Scanning initiative and the Internet Archive’s Million Books project.
LeCun’s reflections illuminate the evolving dynamics of intellectual property in the digital age and highlight the importance of open access resources in democratizing knowledge and fostering innovation in fields such as machine learning and AI.