One year of ChatGPT

It has been exactly 365 days since OpenAI casually dropped ChatGPT. In light of this anniversary, I wanted to reflect on what the last year has been like and also share how I felt when the release happened.

Here is the tweet that started it all:

Try talking with ChatGPT, our new AI system which is optimized for dialogue. Your feedback will help us improve it. https://t.co/sHDm57g3Kr
— OpenAI (@OpenAI) November 30, 2022

The tweet, alongside the official blog post, set off a chain of events that I don’t think anyone was prepared for or saw coming. Even OpenAI tried to downplay the initial success, saying that they hadn’t expected so many people to like the chatbot, nor had they foreseen that usage numbers would skyrocket. And skyrocket they did.

By January 2023, just two months after its launch, reports were coming in that ChatGPT had surpassed 100 million monthly active users.

By comparison, it took TikTok about nine months to reach that mark, and Instagram roughly two and a half years. As of November 2023, ChatGPT has 100 million weekly active users, as Sam Altman confirmed during the first OpenAI DevDay conference.

So, why was ChatGPT such a hit, and how come previous iterations of OpenAI’s GPT models, such as GPT-3, released in 2020, did not see the same success?

This was largely due to RLHF (reinforcement learning from human feedback), which, in the simplest of terms, means real human beings reviewing the model’s responses and rating them as either “good” or “bad”, with those ratings then used to further train the model.

RLHF is used in tasks where it’s difficult to define a clear, algorithmic solution but where humans can easily judge the quality of the model’s output. For example, if the task is to generate a compelling story, humans can rate different AI-generated stories on their quality, and the model can use their feedback to improve its story generation skills.
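To make the feedback idea a little more concrete, here is a minimal sketch of one common way human judgments get turned into a training signal: a pairwise preference loss for a reward model. This is an illustration under my own assumptions, not OpenAI’s actual implementation; the function name and the reward scores are hypothetical.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(reward_chosen - reward_rejected)).

    The loss is small when the response the human preferred already scores
    higher than the one they rejected, and large when the order is reversed.
    """
    margin = reward_chosen - reward_rejected
    # log(1 + exp(-margin)) is the same quantity as -log(sigmoid(margin))
    return math.log1p(math.exp(-margin))


# Hypothetical reward scores for two AI-generated stories a human compared
agrees_with_human = preference_loss(reward_chosen=2.0, reward_rejected=0.5)
disagrees_with_human = preference_loss(reward_chosen=0.5, reward_rejected=2.0)

print(f"scores agree with the human:    {agrees_with_human:.3f}")
print(f"scores disagree with the human: {disagrees_with_human:.3f}")
```

Minimizing a loss like this nudges the reward model toward the human ranking, and that reward model can then be used to steer the chatbot itself.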

By now, everyone knows that ChatGPT sounds eerily confident about anything and everything, which is also its main weakness: being so confident and convincing masks the fact that these chatbots hallucinate (make stuff up) far more often than is acceptable for a serious AI workflow.

Note: The above is an oversimplification. For example, GPT-3 was not released as an AI built for conversation but as a raw model. You could always use the OpenAI Playground to play with different models, with the difference being that ChatGPT was specifically built for conversations.

My initial thoughts and feelings at launch

On the first day of playing with ChatGPT, I immediately knew things would change, especially for me as a writer. That realization was so stark that I experienced something I can only describe as “background depression”, and I do believe that many others experienced something similar.

ChatGPT is very capable, particularly in creative writing and incremental coding tasks, but also at taking in submitted information and working with it. Let’s take the most basic example of this, which is to throw a bunch of disorganized data into ChatGPT and ask it to convert that to a table:

It might not seem like much, right?

But this is such an incredible productivity boost that I am certain people would pay $5, if not $10, per month for this functionality alone. Then you can do things like copy the raw HTML structure of a page and ask ChatGPT to create a Python script that scrapes that website for the data you need and then converts that data into any format you like.
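For a sense of what such a script can look like, here is a minimal sketch that pulls the rows out of an HTML table and converts them to CSV using only the Python standard library. The sample HTML and the names `TableParser` and `html_table_to_csv` are made up for illustration; a real page would need its own selectors and error handling.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_csv(html: str) -> str:
    """Parse the table rows in the given HTML and return them as CSV text."""
    parser = TableParser()
    parser.feed(html)
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


# Hypothetical page fragment, standing in for HTML copied from a real site
SAMPLE_HTML = """
<table>
  <tr><th>Model</th><th>Released</th></tr>
  <tr><td>GPT-3</td><td>2020</td></tr>
  <tr><td>ChatGPT</td><td>2022</td></tr>
</table>
"""

print(html_table_to_csv(SAMPLE_HTML))
```

Swapping the CSV writer for `json.dumps(parser.rows)` would give JSON instead, which is the “any format you like” part.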

It’s small things like this that, one after another, add up not only in the productivity department but also in the overall perception of what is possible. This dawned on me on day one, so even though I could see the immense productivity benefits, I could also see how this talking robot would, one day, inevitably put a writer like myself out of work.

Out of work?!

I am not saying it will happen overnight, but I think it will happen over the next decade, maybe even sooner, depending on what OpenAI decides to do. Admittedly, ChatGPT’s tone is very predictable, and it requires a lot of manual involvement if you use it for writing. Still, a lot of these problems can already be solved with fine-tuning, and then you have to consider that the models will only keep getting better over time.

The copyright dilemma

We can’t talk about ChatGPT, let alone celebrate it, without talking about copyright. Ironically, no matter what I say, it seems that the last nail has already been put in the coffin, with OpenAI not willing to budge on its fair use stance, despite numerous lawsuits from authors.

OpenAI has said the books are used only to spur innovation, not to create new works, and that that practice is lawful under the “fair use” provision of copyright law.
Christina Pazzanese, The Harvard Gazette

The only reason ChatGPT is even remotely as successful as it is is precisely because it was able to ingest a lot of copyrighted works. If it didn’t have access to GitHub data (thanks to the Microsoft partnership) or data from sites like Stack Overflow, Reddit, and other Q&A sites, it would hardly be as knowledgeable or even interesting to use as it is now.

Stack Overflow, at one point, disabled its regular data dump to “assess” how it could deal with companies like OpenAI simply taking that data for themselves. Reddit went through an entire exodus over its API changes, which the CEO said were made because LLMs are getting too greedy.

Then, in August 2023, OpenAI introduced GPTBot:

Web pages crawled with the GPTBot user agent may potentially be used to improve future models and are filtered to remove sources that require paywall access, are known to primarily aggregate personally identifiable information (PII), or have text that violates our policies. Allowing GPTBot to access your site can help AI models become more accurate and improve their general capabilities and safety.

The idea is that you can opt out of having your site crawled by OpenAI and its content used to train future models. When I did my initial analysis, many sites had chosen to opt out, including big names like The Verge, Quora, Amazon, CNN, Reuters, etc. Ben Welsh, who maintains the palewire project, also tracks how many news sites block GPTBot and other similar bots, with the current number being 581 out of 1,153 sites.
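The opt-out itself is done in a site’s robots.txt file, with a rule that disallows the GPTBot user agent. As a small sketch, the snippet below uses Python’s standard `urllib.robotparser` to check whether a given robots.txt blocks GPTBot. The robots.txt content, the `is_blocked` helper, and the example.com URL are illustrative assumptions, not taken from any particular site.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt that opts out of GPTBot but allows every other crawler
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules forbid user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


print(is_blocked(ROBOTS_TXT, "GPTBot", "https://example.com/article"))
print(is_blocked(ROBOTS_TXT, "SomeOtherBot", "https://example.com/article"))
```

Running the same check against the robots.txt of many sites is roughly how a blocklist tally like the one mentioned above can be produced, although robots.txt is only a request and relies on the crawler honoring it.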

And on top of that, out of the ~200 articles that I have published this year, my most popular ones have been specifically about AI training:

Zoom’s updated Terms of Service permit training AI on user content without Opt-Out (800,000+ views)
The shady world of Brave selling copyrighted data for AI training (200,000+ views)
X/Twitter has updated its Terms of Service to let it use Posts for AI training (100,000+ views)

And yet, none of this really amounts to anything because it feels like this war was lost even before it began. OpenAI will move on and enjoy a succulent $90 billion valuation that it will have the option to grow incrementally as it drip-feeds future releases of the GPT model.

I have come to accept that as the reality.

How do I use ChatGPT daily?

The four areas where I use it the most are:

Summarizing. For any paper that isn’t immediately clear on its premise, I’ll throw it in the blender to extract the key points.
Scripting. Being able to quickly extract various data from websites with the help of a JavaScript snippet that I don’t need to write myself is super helpful and productive. The same goes for Python scripts that I can quickly whip up to automate mundane tasks.
Linux workflow. My entire digital workflow is within a Linux shell environment. So creating complex awk queries or Bash scripts for monitoring, data extraction, etc., has been great. And I might add that I have had very few occurrences where the commands given have been wrong or not working.
Images. Now that DALL·E has been integrated into ChatGPT, I use it to generate featured images for my articles, which is excellent. I don’t mind using Unsplash or other CC sites, and I still use them, but I’ve always liked vector designs.

At the start of the year, the big problem was the context window because ChatGPT only supported either 4k (GPT-3.5) or 8k (GPT-4) tokens, which meant that if you asked it to generate code boilerplates or wanted to have a long conversation, you would quickly run out of room.

That seems to be changing now, though, with the default GPT-4 model in the UI using a 32k context window, and hopefully, that will soon get bumped to 128k, in line with the GPT-4 Turbo model that OpenAI has already announced.
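To show why a small context window gets in the way of long conversations, here is a rough sketch of the trimming that has to happen once the history no longer fits. It assumes a crude estimate of about four characters per token and two hypothetical helper names; real tokenizers count differently, and this is not how ChatGPT itself manages history.

```python
def estimate_tokens(text: str) -> int:
    """Crude assumption: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)


def trim_history(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose estimated tokens fit the budget."""
    kept = []
    used = 0
    # Walk backwards so the newest messages are kept first
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    # Restore chronological order
    return list(reversed(kept))


# Hypothetical conversation: three messages of about 10 tokens each
conversation = ["a" * 40, "b" * 40, "c" * 40]

print(len(trim_history(conversation, budget=25)))    # a tight budget drops the oldest message
print(len(trim_history(conversation, budget=4096)))  # a larger window keeps everything
```

With a larger window, fewer early messages have to be dropped, which is why the jump from 4k or 8k to 32k and beyond matters for long sessions.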

And then there are GPTs (which, I might add, I have not yet had a chance to truly dive into), which can be likened to mini fine-tuned versions of ChatGPT for a specific task or tasks. They even support uploading custom data (such as a PDF or a long document), which means you can tackle the hallucination problem to an extent, especially if you want to hyper-optimize for a specific task.

But more than that, it’s yet another significant productivity boost that slowly but surely is getting people accustomed to using ChatGPT instead of doing Google searches for everything.

Anecdotally, every regular person who does not work in tech I have spoken to this year has no idea what ChatGPT is. For some of them, I had to help install it on their phone so that they would understand what I was talking about.

The timeline of releases and updates

Last but not least, here is the “One year of ChatGPT” as it unfolded from OpenAI’s shipping and product updates perspective:

November 30, 2022: Launch of ChatGPT in research preview by OpenAI.
December 15, 2022: General performance enhancements and new features for managing conversation history.
January 9, 2023: Improvements in factuality; feature to halt response generation mid-conversation.
January 30, 2023: Further upgrades for enhanced factuality and mathematical capabilities.
February 9, 2023: Introduction of ChatGPT Plus with new features and a faster ‘Turbo’ version.
February 13, 2023: Updates to the free plan’s performance and international availability of ChatGPT Plus.
March 14, 2023: Integration of GPT-4 with advanced reasoning and creativity.
March 23, 2023: Experimental AI plugins, including browsing and Code Interpreter capabilities, for selected users.
May 3, 2023: Ability to turn off chat history and export data.
May 12, 2023: Early access to experimental web browsing and third-party plugins for Plus users.
May 24, 2023: iOS app expansion with new features.
June & July 2023: Mobile app updates; introduction of Code Interpreter in beta; increased message limits for GPT-4.
July 25, 2023: Android version of the ChatGPT app launched.
August 2023: User experience enhancements; custom instructions extended to free users.
August 28, 2023: Launch of ChatGPT Enterprise.
September 11, 2023: Limited language support in the web interface.
September 25, 2023: Beta introduction of voice and image input capabilities.
September 27, 2023: Updated version of web browsing for Plus users.
October 16, 2023: Integration of DALL·E 3 in beta for image generation from text prompts.
October 17, 2023: Browsing feature moved out of beta for Plus and Enterprise users.
November 6, 2023: Introduction of customizable versions of ChatGPT, called GPTs, for specific tasks.
November 21, 2023: Voice feature made available to all users.

Thanks for reading!
