Know Your Net

Web 1, 2, 3... why do people refer to the World Wide Web with versions?

Show Notes

The World Wide Web didn't exist at the beginning of the internet. It started as a proof of concept and evolved from there. Tech writers have categorized eras of the web into versions. This episode explores each one.

Transcripts can be found here.

What is Know Your Net?

A look at the history of the internet to provide context for today's technology ecosystem. From the birth of the internet to the evolution of the hellscape that is social media today.

The World Wide Web has become synonymous with the internet, yet it’s really just one small but very visible part of the internet. How did that come into being?

The invention of the World Wide Web starts in Switzerland at the CERN research lab. You may be familiar with CERN if you follow news on the Large Hadron Collider and the search for the Higgs boson, popularly called the God particle.

While Tim Berners-Lee worked at CERN, he wanted to facilitate sharing and updating information among researchers. He took the idea of hypertext from Douglas Engelbart and connected it to the Transmission Control Protocol and the domain name system to come up with HTTP (Hypertext Transfer Protocol) in 1989.

He built the first website and put it online on August 6, 1991. The website is extremely basic by today's standards but revolutionized the internet. In 1994 he founded the W3C at MIT, composed of various companies that were willing to create standards and recommendations to improve the quality of the web. He made his ideas freely available, with no patent and no royalties due, so that they could be adopted by anyone.

The W3C is the World Wide Web Consortium. It's a group of experts that sets standards for the web. When it comes to the internet today, this is pretty much how leadership structures look for all of the underlying technology. For example, Steve Crocker was instrumental in creating the ARPA "Network Working Group," which would lead to the Internet Engineering Task Force in 1986. The IETF has no formal membership. Its mission is to develop and promote voluntary internet protocol standards.

Request for Comments is a type of publication from the technology community. It’s a memorandum describing methods, behaviors, research or innovations that are applicable to the working of the Internet. A memo is submitted either for peer review or to convey new concepts or information.

Internet specifications start as RFCs. Early RFCs were actual requests for comments. The pioneers of the internet avoided sounding too declarative and wanted to generate discussion, so the style leaves questions open and is less formal. Jon Postel is one pioneer primarily known for his contributions to Internet standards: he served as the RFC Editor. In 1980 he proposed the Mail Transfer Protocol, which evolved into SMTP, the Simple Mail Transfer Protocol. It is still the standard protocol for internet email.

Jon Postel is also known for "Postel's Law": "An implementation should be conservative in its sending behavior, and liberal in its receiving behavior."

This was reworded later to “Be liberal in what you accept, and conservative in what you send.”
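
To make the principle concrete, here is a purely illustrative Python sketch, not code from any RFC: it is liberal about the date formats it accepts but conservative about what it emits, always sending one canonical form.

```python
# An illustration of Postel's Law: accept several spellings of a date
# (liberal in what you accept), always emit ISO 8601 (conservative in what you send).
from datetime import datetime

ACCEPTED_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y"]  # hypothetical inputs we tolerate

def normalize_date(raw: str) -> str:
    """Try each tolerated format; always return a single canonical form."""
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date: {raw!r}")

print(normalize_date("August 6, 1991"))  # -> 1991-08-06
print(normalize_date(" 1991-08-06 "))    # -> 1991-08-06
```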

The first generation of the web is now referred to as Web 1.0.

Does the World Wide Web have versions? Not really. But critics and commentators like to categorize eras in the development of the web to denote sea changes in how the web is utilized and what tech companies are focusing on.

The focus for Web 1.0 was on building the web, making it accessible, and seeing how it could be commercialized. This is when Internet Service Providers, or ISPs, grow and mature. Protocols are developed and there is a push for open standards. It's all new, too, so developers need tools to build a better web, which makes software for developers another key area of interest.

A key development that happens during this time and continues today is the idea of a web service. A web service is technology used for machine-to-machine communication: one program requests structured data from another over the web. Today much of web service development is devoted to accessing databases.
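
As a rough sketch of what machine-to-machine communication looks like in practice, here are a few lines of Python that call a hypothetical JSON web service; the URL and the response field are made up for illustration.

```python
# A minimal sketch of one machine asking another for structured data over HTTP.
import json
from urllib.request import urlopen

def fetch_books(api_url: str = "https://example.com/api/books") -> list[dict]:
    """Request JSON from a (hypothetical) web service and return it as Python objects."""
    with urlopen(api_url) as response:      # the client machine asks the service for data
        return json.loads(response.read())  # and parses the structured reply

# for book in fetch_books():
#     print(book["title"])                  # assumes the service returns a "title" field
```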

A significant early web service is RSS, or Really Simple Syndication, developed by Dave Winer along with Dan Libby and Aaron Swartz in the late 1990s. RSS uses plain-text XML files delivered over HTTP, the open protocol developed by Tim Berners-Lee, to syndicate content. The development of RSS allows blogging to become much more accessible. Bloggers are able to join syndication networks. A blogger can create what's known as an RSS feed that links to each blog post, and that RSS feed can be read on another website, allowing news-oriented sites to have fresh content. Subscriber sites add greater depth and immediacy of information to their pages. This increases exposure, makes blogs easier to find, and generates new traffic.
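
Under the hood, an RSS feed is just structured text that a machine can walk through. Here's a minimal sketch, using only Python's standard library and an invented two-item feed, of how a subscriber site might pull out post titles and links.

```python
# Parse a tiny RSS 2.0 feed and list its items. The feed below is a made-up
# example; a real subscriber would fetch the feed over HTTP.
import xml.etree.ElementTree as ET

FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item><title>First post</title><link>https://example.com/1</link></item>
    <item><title>Second post</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""

root = ET.fromstring(FEED)
for item in root.findall("./channel/item"):   # each <item> is one blog post
    print(item.findtext("title"), "->", item.findtext("link"))
```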

At this point, the web seems a bit more dynamic, yet it is still very much read-only. Publishing content still requires a degree of technical proficiency. Early bloggers had to have knowledge of HTML and the File Transfer Protocol (FTP). A blog entry was manually added to the page's code and uploaded to the server.
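
For a sense of that early workflow, here is a rough Python sketch of the "edit the HTML, then FTP it to the server" routine; the host, credentials, and file name are placeholders, not real values.

```python
# Push a hand-edited HTML page to a web host over FTP (placeholder details).
from ftplib import FTP

def publish_post(local_file: str = "index.html") -> None:
    with FTP("ftp.example.com") as ftp:               # connect to the web host
        ftp.login("username", "password")             # placeholder credentials
        with open(local_file, "rb") as page:
            ftp.storbinary(f"STOR {local_file}", page)  # upload the edited page
```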

It wasn't until 2004 that blogging was considered mainstream. The event that signaled its arrival was a story about Trent Lott, a US Senator from Mississippi, praising Senator Strom Thurmond. At a party for Thurmond's 100th birthday in 2002, Senator Lott suggested that the United States would have been better off had Thurmond been elected president in 1948. Critics saw these comments as tacit approval of racial segregation, since Thurmond had run on a segregationist platform during his 1948 presidential campaign. Bloggers dug up documents and recorded interviews that reinforced the accusations of racism against Lott.

Though Lott's comments were made at a public event attended by the media, no major media organizations reported on his controversial comments until after blogs broke the story. Blogging helped to create a political crisis that forced Lott to step down as majority leader of the US Senate.

After the Lott incident, political consultants, news services and candidates began using blogs for outreach and opinion forming. In 2005, Fortune magazine listed 8 bloggers whom business people couldn’t ignore: Peter Rojas, Xeni Jardin, Ben Trott, Mena Trott, Jonathan Schwartz, Jason Goldman, Robert Scoble, and Jason Calacanis.

The term Web 2.0 was coined by Darcy DiNucci in 1999 and popularized by Tim O'Reilly and Dale Dougherty at the O'Reilly Media Web 2.0 Conference in 2004. The label is meant to declare the next generation of internet-based services. The range of what was considered next generation is broad and vague, but in general it focuses on social networking sites, wikis, communication tools, and folksonomies.

Broadly speaking, Web 2.0 is when the web became dynamic. The primary use of a Web 2.0 website wasn’t the retrieval of information but the collaborative production of it. A social network or a wiki is nothing without users generating content.

Web 2.0 sites allow the audience to interact with content. People can comment on blog posts, add and edit entries on a wiki, tag posts, and upload media. Every person connected to a site is able to broadcast and receive text, images, audio, video, software, data, links, etc. Participatory media derives value from the active participation of many people. StumbleUpon was an example of this participation in action. As a discovery and advertising engine, it pushed recommendations of content to users through peer-sourcing.

Another aspect of Web 2.0 is folksonomies. A folksonomy is a system in which users apply public tags to online items. It's also known as collaborative tagging, social classification, social indexing, and social tagging. Tagging is easy to understand and do. It's very flexible and it directly reflects the user's vocabulary. As a developer, I might not think to add "dank" as a way to classify memes, but I'm sure that some users would.
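
A folksonomy is, at bottom, a very simple data structure: a mapping from user-supplied tags to the items they describe. Here's a toy Python sketch with invented items and tags.

```python
# A toy folksonomy: whatever labels users supply become the searchable vocabulary.
from collections import defaultdict

tag_index: dict[str, set[str]] = defaultdict(set)   # tag -> items carrying that tag

def tag(item: str, *labels: str) -> None:
    for label in labels:
        tag_index[label.lower()].add(item)           # any user's word becomes a category

tag("meme-42", "funny", "cats")   # one user's tags
tag("meme-42", "dank")            # another user's tag, a label a developer might never pick
print(sorted(tag_index["dank"]))  # -> ['meme-42']
```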

Hashtags are also a folksonomy. These are especially powerful in the creation and discovery of communities. Of course, their flexibility can also be detrimental. For example, a misspelled hashtag might not be seen by the very people that the poster was trying to include in a conversation.

Information that is amplified enables broader, faster, and lower cost coordination of activities. What was once one to many, like broadcast media, has transformed into conversations among people formerly known as the audience. Conversations are open-ended and assume equality.
This is also when the mix of reduced barriers to entry for blogging and the attention of mainstream media stirs up both hopes of a new era for democracy and a lot of lawsuits.

By 2008, a new blog was created every second. And up until 2009, most blogs were the work of one individual. When it comes to what's legal and what isn't, the record has been mixed. Internet Service Providers in general have been granted immunity from liability for information that originates with third parties. So if AT&T is my internet service provider and I create a blog on Blogger.com where I post horrible lies about my partner, AT&T can't be sued for my actions.

And what about anonymity? What if I posted horrible lies about my partner but did so anonymously? In Doe v. Cahill the Delaware Supreme Court held that stringent standards had to be met to unmask anonymous bloggers. In general, attempts to stay anonymous haven’t worked out that well. Even if the courts don’t unmask an anonymous blogger, the bloggers usually get unmasked by someone else. And then there are social networks. If I post horrible lies about my partner on Facebook, can Facebook be held liable?

This is the current debate. To what extent are platforms responsible for reprehensible behavior by their users? In the beginning, platforms said that it’s not their job to police user behavior. It’s a free speech issue. Today? It looks like some platforms are slowly evolving.

What’s clear is that people will post a wide range of horrible things. Sometimes as a whistleblower, sometimes as a troll. And in the absence of regulation, the target of an attack has to figure out how to deal with it on their own. Hopefully, the approach of shoving your head in the sand and trying to scream a louder narrative will eventually fade away.

So what's cutting edge today? What's being labeled as Web 3.0? Some refer to Web 3.0 as the Semantic Web. A semantic web allows the data in web pages to be structured and tagged in a way that can be read directly by computers.

In earlier versions of HTML, nearly every section of a page was marked up with a generic container called a div. In a given webpage, a menu would be a div, the content of the article would be a div, the ads would have their own divs, and so on. With HTML5, a variety of new elements have been added to give more clarity. You can still use the div element, but now you can also define sections, menus, headers, footers, articles, and more.

Microformats have been developed to work in conjunction with the semantic web, standardizing how common kinds of data are marked up. For example, the microformat hCalendar marks up an event's summary, start date, end date, location, and URL within an ordinary HTML page. It doesn't look any different to the human reading the webpage, but it makes a world of difference to the machine that is trying to parse it.
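
As an illustration of why this matters to machines, here is a rough Python sketch that pulls hCalendar fields out of a page using only the standard library. The HTML snippet is invented; the class names (vevent, summary, dtstart, location) come from the hCalendar microformat.

```python
# Extract a few hCalendar event fields from marked-up HTML.
from html.parser import HTMLParser

PAGE = """<div class="vevent">
  <span class="summary">Web 2.0 Conference</span>
  <span class="dtstart">2004-10-05</span>
  <span class="location">San Francisco</span>
</div>"""

class EventParser(HTMLParser):
    WANTED = {"summary", "dtstart", "location"}   # hCalendar property classes we care about

    def __init__(self):
        super().__init__()
        self.current = None   # which field we are currently inside, if any
        self.event = {}

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "").split()
        hits = self.WANTED.intersection(classes)
        if hits:
            self.current = hits.pop()

    def handle_data(self, data):
        if self.current and data.strip():
            self.event[self.current] = data.strip()
            self.current = None

parser = EventParser()
parser.feed(PAGE)
print(parser.event)
# -> {'summary': 'Web 2.0 Conference', 'dtstart': '2004-10-05', 'location': 'San Francisco'}
```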

Another focus of Web 3.0 is natural language search, or searching the way you speak. If you wanted to find good Chinese food, how would you phrase that to a search engine? I've learned to think like a search engine, so I usually type in something like "best chinese restaurants miami." With a natural language search engine, you would type "Where can I eat the best Chinese food nearby?"

A third umbrella term for Web 3.0 is data mining: the practice of using large databases to generate new information. For example, Amazon uses its database of your searches and habits to figure out what you would like to buy next. The same principles can also be found in chatbots known as Recommendation Agents. These agents elicit an individual's interests or preferences for products as you talk to them, and then they make recommendations.
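
To make the "people who bought X also bought Y" idea concrete, here's a toy Python sketch with invented purchase histories; it illustrates the principle, not how Amazon actually does it.

```python
# A toy co-occurrence recommender: suggest items that appear in the same
# baskets as the item of interest. Purchase data is invented.
from collections import Counter

purchases = {
    "alice": {"modem", "router", "ethernet cable"},
    "bob":   {"modem", "router"},
    "carol": {"modem", "webcam"},
}

def recommend(item: str, top_n: int = 2) -> list[str]:
    """Count what else shows up in baskets containing `item`."""
    co_bought = Counter()
    for basket in purchases.values():
        if item in basket:
            co_bought.update(basket - {item})   # everything bought alongside `item`
    return [name for name, _ in co_bought.most_common(top_n)]

print(recommend("modem"))   # -> ['router', ...] (ties among the rest broken arbitrarily)
```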

The last piece is a tale of two versions of Artificial Intelligence, or AI. One is dumb and just tries to make life easier by automating tasks. The other wants you to believe it's sentient. And while Artificial Intelligence isn't specifically categorized under data mining, it is data hungry. All the data that we create gets fed into software that tries to make sense of it.

So if I looked at your data trail, how might I make sense of you?