IPFS stands for InterPlanetary File System. It is an open-source, peer-to-peer distributed hypermedia protocol that aims to provide ubiquitous file system functionality across all computing devices.
This is a complex and extremely ambitious project that will have serious and profound implications for the future development and structure of the Internet as we know it.
Quick summary
- The goal of IPFS is to be a distributed file system and hypermedia protocol to solve the problems of the current HTTP-based web, including inefficient content delivery, high costs, centralization leading to censorship, and lack of persistence.
- IPFS allows you to request content from peers on the network instead of downloading it from a central server. This enables efficient deployment, version control history, continuous availability, and content integrity.
- IPFS combines concepts from systems such as BitTorrent, Git, and Kademlia. Content is uniquely identified by a cryptographic hash, allowing for tamper-proof decentralization.
- Merkle DAG allows content to be addressed, deduplicated, and tamper-proofed. The IPNS naming system allows human-readable addressing.
- The benefits of IPFS include much lower storage and bandwidth costs, censorship resistance, faster performance for large data sets, and integration with persistent websites and blockchain networks.
- Use cases include cheaper content delivery, immutable storage and websites, reduced bloat through integration with blockchain transactions, and as a foundational layer for decentralized networks and applications.
Why and how did IPFS start?
The current iteration of the Internet is not as decentralized as it was originally envisioned to be, and it still relies on aging protocols that have caused numerous problems. The problems that IPFS aims to solve center on the HTTP protocol that underpins the current web.
If you’re not familiar with HTTP, it basically underpins data communication across the entire web. HTTP was invented in 1991 and adopted by web browsers in 1996. It defines how messages are formatted and transmitted over the Internet, how browsers should respond to various commands, and how servers handle requests.
Essentially, it is the underlying protocol for how the web is navigated and the protocol backbone of the client-server paradigm.
HTTP vs IPFS, image from MaxCDN
HTTP gave us the Internet we know today, but it is outdated and now, more than 20 years later, its widespread problems are becoming more and more apparent.
A major problem that arises in today’s HTTP implementations is the result of the massive increase in Internet traffic and the resulting amplification of stress points.
The following issues have emerged with the current HTTP implementation:
- Content delivery is inefficient as files are downloaded from one server at a time.
- Expensive bandwidth costs and widespread file duplication lead to bloated storage.
- Increased centralization of servers and providers increases Internet censorship.
- Records of information stored on the Internet are fragile, and web pages have short lifespans.
- Intermittent connectivity leaves many users offline and connection speeds slow.
The list of problems goes on, and in an age of technological innovation, it is no surprise that technology that is over 20 years old is becoming noticeably outdated. IPFS provides the distributed storage and file system the Internet needs to achieve its true potential.
In IPFS, instead of downloading a file from a single central server, you ask peers across the network for the content itself. This enables high-volume, highly efficient data distribution, historical versioning, resilient networks, and continuously available content that is secured and verified through cryptographic hashing and spread across peer networks.
This all looks promising, but how does it work?
How does IPFS work?
Conceptually, IPFS is similar to the World Wide Web as we know it today, but it is more accurately described as a single BitTorrent swarm exchanging objects within a single Git repository.
Files are distributed via a BitTorrent-inspired protocol. Importantly, IPFS acts as a kind of combination of Kademlia, BitTorrent, and Git to create a distributed subsystem of the Internet.
The protocol is designed to provide historical versioning of content, much like Git. Each file, and every block within it, is given a unique identifier: a cryptographic hash. Duplicates are removed across the network, and version history is tracked for every file.
This leads to continuously available content that won’t disappear from your web pages due to server failure or web host bankruptcy.
This mechanism also ensures the authenticity of the content: when you search for a file, you are essentially asking the network to find the nodes storing the content behind that content’s unique identifying hash.
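To make this concrete, here is a deliberately simplified Python sketch of content-addressed retrieval. It is not the real implementation: IPFS uses multihash-based CIDs, a Kademlia DHT for routing, and the Bitswap block-exchange protocol, whereas here a raw SHA-256 hex digest and an in-memory dictionary stand in for the network.

```python
# Simplified sketch only: real IPFS uses multihash CIDs, a Kademlia DHT,
# and the Bitswap protocol; a dict stands in for the peer-to-peer network.
import hashlib

def content_id(data: bytes) -> str:
    """Identify content by a cryptographic hash of its bytes."""
    return hashlib.sha256(data).hexdigest()

# Toy "network": each peer stores blocks keyed by their content ID.
peers = {"peer-a": {}, "peer-b": {}}

def publish(peer: str, data: bytes) -> str:
    cid = content_id(data)
    peers[peer][cid] = data
    return cid

def fetch(cid: str) -> bytes:
    """Ask whichever peer holds the block, then verify it against the ID."""
    for store in peers.values():
        if cid in store:
            data = store[cid]
            # The requester never has to trust the peer: the bytes must
            # hash back to the ID that was asked for.
            assert content_id(data) == cid, "block does not match its hash"
            return data
    raise KeyError(f"no peer is providing {cid}")

cid = publish("peer-a", b"hello ipfs")
print(fetch(cid))  # b'hello ipfs', served by whichever peer happens to have it
```

The key point is that the requester never has to trust the peer that served the block: the bytes either hash back to the identifier that was asked for or they are rejected.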
Links between nodes in IPFS take the form of cryptographic hashes, which is made possible by the Merkle directed acyclic graph (DAG) data architecture. The advantages of the Merkle DAG for IPFS include (see the sketch after this list):
- Content Addressing – Content has a unique identifier, which is a cryptographic hash of the file.
- No duplicates – Files with identical content produce the same hash and are stored only once.
- Tamper resistance – Data is verified against its hash, so if the content is altered, its hash changes and IPFS knows the data has been tampered with.
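Below is a minimal sketch, under the same simplifying assumptions as before (raw SHA-256 hex digests in place of IPFS’s CIDs, JSON in place of its actual node encoding), of how these three properties fall out of a Merkle DAG: chunks are addressed by their hash, identical chunks are stored only once, and any change to the content changes the root hash.

```python
# Minimal Merkle DAG illustration -- not the real IPFS object format.
import hashlib
import json

blocks = {}  # content-addressed block store: hash -> bytes

def put(data: bytes) -> str:
    h = hashlib.sha256(data).hexdigest()
    blocks[h] = data  # storing an identical block twice is a no-op (dedup)
    return h

def put_file(data: bytes, chunk_size: int = 4) -> str:
    """Split a file into chunks, store each chunk, then store a root
    node that links to the chunks by hash."""
    chunk_hashes = [put(data[i:i + chunk_size])
                    for i in range(0, len(data), chunk_size)]
    root = json.dumps({"links": chunk_hashes}).encode()
    return put(root)

root_a = put_file(b"hello world!")
root_b = put_file(b"hello world!")
root_c = put_file(b"hello w0rld!")

print(root_a == root_b)  # True: identical content -> identical root hash
print(root_a == root_c)  # False: one changed byte changes a chunk hash and
                         # therefore the root hash -- tampering is evident
```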
IPFS uses Merkle links to link file structures together, and all files can be located by human-readable names using a distributed naming system called IPNS.
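As a rough illustration of the IPNS idea only (the real system derives names from public keys and cryptographically signs each record, none of which is shown here), the sketch below shows a mutable name being re-pointed at new content while every version of the content itself stays immutable and hash-addressed.

```python
# Toy IPNS-style mutable pointer -- purely illustrative, not the real record format.
import hashlib

def content_id(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

ipns_records = {}  # stable name -> hash of the latest content

def publish(name: str, data: bytes) -> str:
    cid = content_id(data)
    ipns_records[name] = cid  # re-point the stable name at the new content
    return cid

publish("my-site", b"<html>version 1</html>")
publish("my-site", b"<html>version 2</html>")  # same name, new content hash
print(ipns_records["my-site"])  # always resolves to the latest version's hash
```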
The implementation of the Merkle directed acyclic graph (DAG) is central to the basic functionality of the protocol, but its details are beyond the scope of this article.
If you would like to learn more about this aspect of IPFS, you can find much more detailed information on the IPFS GitHub page and read about how Merkle trees work here.
Each node stores only the content it is interested in and indexes information that helps determine who is storing what. The IPFS framework essentially eliminates the need for a centralized server to deliver website content to users.
Ultimately, this concept could render the HTTP protocol obsolete and allow users to access content locally and offline. Instead of querying servers as in the current Internet infrastructure, users would search for a unique ID (a cryptographic hash), allowing files to be delivered by millions of computers rather than a single server.
Currently, the main implementation of IPFS is written in Go, with implementations in both Python and JavaScript in progress. It is compatible with Linux, macOS, Windows, and FreeBSD.
Because it is an open-source, community-driven project, you can contribute by following the instructions and documentation on the GitHub page or by running your own IPFS node.
Use cases and implications
There are already several important use cases for IPFS, and more are sure to emerge as the protocol continues to develop. Delivering a new decentralized peer-to-peer architecture for the Internet comes with complexities, but the benefits can be seen in everything from significant financial savings in storage and bandwidth to integration with decentralized blockchain networks.
The obvious benefits of IPFS’ distributed storage model include much more efficient data storage, along with immutability and persistence of content.
Your website will no longer be subject to periodic 404 errors caused by downed servers or broken HTTP link chains. Additionally, in terms of efficiency, there are significant benefits for researchers, especially those who need to parse and analyze very large data sets.
As big data becomes more prevalent in modern science, the fast performance and distributed storage of data provided by IPFS are ideal for accelerating progress.
Service providers and content creators can also significantly reduce the costs of providing large amounts of data to their customers. The current iteration of this paradigm is hampered by rising bandwidth costs and data providers charging for peering agreements.
The costs associated with delivering content across centralized infrastructures in interconnected networks continue to increase, and attempts to overcome these burdens are creating significant inefficiencies and a more centralized environment.
Using IPFS, image from Blockchain Mind
Centralization of servers also facilitates government snooping, the proliferation of DDoS attacks, ISP censorship, and the selling of personal data.
“Content in IPFS can move through untrusted intermediaries without giving up control of your data or putting it at risk,” said IPFS founder Juan Benet.
Lastly, the integration of IPFS with blockchain technology seems like a perfect fit. IPFS allows you to place immutable, permanent links inside blockchain transactions, timestamping and securing content without having to store the data itself on-chain. This provides a convenient and secure off-chain storage solution that reduces blockchain bloat and supports blockchain scaling.
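As a hedged sketch of this off-chain pattern (the field names and structure below are purely illustrative and not tied to any particular blockchain), the document itself would live in IPFS while the transaction records only its content hash, which is enough to timestamp the data and later prove its integrity.

```python
# Illustrative off-chain storage pattern: only the content hash goes on-chain.
import hashlib
import json
import time

def content_id(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

document = b"large off-chain document ..."   # stored and served via IPFS
doc_hash = content_id(document)              # this small hash is what goes on-chain

transaction = {
    "timestamp": time.time(),
    "ipfs_hash": doc_hash,                   # a few dozen bytes instead of the whole file
}
print(json.dumps(transaction, indent=2))

# Later, anyone holding the document can prove it matches the on-chain
# commitment simply by recomputing the hash.
assert content_id(document) == transaction["ipfs_hash"]
```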
IPFS is included in a variety of cryptocurrency platforms and has the potential to symbiotically help the industry expand by providing the necessary peer-to-peer and distributed file system architecture as the foundation needed to support the growth of cryptocurrency platforms.
Conclusion
As you can see, IPFS is a technically and conceptually complex protocol with grand ambitions to revolutionize data exchange over the Internet.
HTTP was a success in its own right and helped the Internet reach the massive scale it has today. But as new technologies emerge, the need for a reformed, decentralized infrastructure is becoming clear.