I always appreciate these personal approaches and descriptions. It's good to know that you've been around the block several times, and you've settled on these particular tools. I myself have centered most of my research around ChatGPT Plus, and as a bonus, I get really good image generation. I use Gemini (or Google's Experimental Model) to read and "grade" my work - ChatGPT is still better at reviewing writing, but Gemini will notice some errors ChatGPT will not.
Perplexity has become a favorite too. It's amazing for quick research, probably better than the other 2 I mentioned. I've used the paid versions of GPT and Gemini, but only the free Perplexity model, and it is nearly as good as the paid models for my needs.
I might need a crash course on Perplexity one of these days. You're one of many people on Substack who seem to use it regularly, but while I've tried it out multiple times, I could never get it to "stick" in my routine.
For me it helps to think of Perplexity as a research tool, not a colleague that I’m chatting with. (Maybe this only makes sense to me though!)
Same. I have now started using it, especially the "Pro" mode for deeper, slightly more complex searches. The "Pro" mode appears to be somewhere between traditional AI search and the "Gemini Deep Research" you mentioned in a separate comment. It makes an action/research plan and follows it, although it doesn't give you the option to edit it and doesn't crawl nearly as many pages.
For me, if I already have an idea about something but want to verify info, it's very fast and very effective, and very transparent insofar as where the info is coming from.
I do note that what we're using these tools for very much determines which tool is the best to use. I'm not sure that was the case a year ago.
Agreed. Perhaps the reason I never got "into" Perplexity is that my current writing and research routine isn't a good fit for it. I mean, it's clearly a solid product, but I just don't find myself naturally navigating to it.
Yep. Same reason I don't use Midjourney every day.
Did you forget to mention perplexity.ai?
I was expecting that to show up under research.
At this point, I'd be curious to hear from anyone who knows about perplexity and is _not_ using it. I guess that is not you, but just double-checking.
Hey Nico. Nope, haven't missed it.
Count me in as someone who's well aware of Perplexity but isn't using it actively. I remember trying it way back in early 2023 and even showcasing it as the future of search to some friends. I also know many people here on Substack who absolutely swear by Perplexity.
I've made several attempts at incorporating Perplexity into my daily life, going as far as following specific instructions on YouTube about setting up special browser extensions for it and using it as my default search engine. At the time, I went back to Google because I didn't like how Perplexity handled navigational queries - it'd take me to a Perplexity answer for a company instead of just opening their website when I'd type the name into the search bar.
So, while I recognize that Perplexity is a very useful tool for many people, it'd feel disingenuous to put it on a list of my personally used tools. I briefly considered putting it on the list (under "Research," exactly as you mentioned) but then I'd have to add the disclaimer that I don't use it, which might've given the impression that I don't approve of it. But the simple story is that it's just not in my daily repertoire. Perhaps it will be at some later stage.
Interesting
It's a pity that Claude does not support web search at the moment. In terms of language and programming, it is, in my opinion, the best. I am also considering using Perplexity since it gives me the option to choose my model, even within the same chat. Grok has improved significantly in image creation as well, but I still prefer Midjourney.
Agreed. I think the lack of web access is one of the main "notches" against Claude right now. That, and the very limited bandwidth/rate limits for free accounts. I did try Grok's own "Aurora" image model and also found it pretty great. Midjourney is slowly losing ground to some newcomers, but on the other hand, they haven't had a new version out since late 2023 (!). I know they're working on V7 and I can't wait to see what that looks like!
My first hands-on experience with AI was through Canva. I started creating images for my templates. At the beginning they were weird, but they improved rapidly. Then I moved to ChatGPT (paid version) and now am using Adapta. I use ChatGPT to access DALL-E to create images; as I need to create content for my students, I upload images (photos) and ask AI to describe the scene. I gather both the description (text) and the image I created (JPG) and upload them to Canva to create a new template for a class. Those are the ones I use the most.
That's a very solid workflow, Anna!
It's actually something I talked about early last year: Using chatbots to help you describe and make better prompts for image models: https://www.whytryai.com/p/ai-images-chatbots
I encourage you to try other image models with the same prompts. DALL-E 3 is pretty good at instruction following, but I think it's falling a tiny bit behind in image quality. So try giving Ideogram or FLUX a shot!
Have you checked out Gemini’s Deep Research yet? It’s pretty amazing. https://open.substack.com/pub/aigoestocollege/p/gemini-deep-research-a-true-game
I wrote about it and strongly considered pulling the trigger on Gemini Advanced free trial, but haven't gotten around to it yet. It looks really solid though!
Recently, this open-source model came on my radar that does something vaguely similar: https://github.com/InternLM/MindSearch/blob/main/README.md
It's definitely on my list to take them for a spin!
Ideogram is a game changer if you need text in images.
Yup, it's top-tier in my latest test from November 2024: https://www.whytryai.com/p/ai-image-model-spelling-text
But don't sleep on FLUX and Recraft if spelling is your jam!
I’ll have to go through these :)
Have fun!
Great selection of tools, thanks.
You got it, Curtis!
Interesting mentions of the text-to-video options, which I expect to dramatically improve in 2025.
Agreed. I feel we're in the same spot with video AI right now as we were with text-to-image models in late 2022 when I was starting the newsletter. Midjourney and Stable Diffusion were just beginning to make images that looked realistic and polished. Then Midjourney V4 came, etc. So I expect further leaps in video over the coming months for sure.
I continue to pay for an OpenAI / ChatGPT Plus account for most of the same reasons as you, and also the "memory" feature which I really like. I use Gemini and Pi for free on a Pixel phone. I pay for Perplexity Pro, and Claude, and also Poe, which gives me access to the latest versions of pretty much all the top vendors' models (other than Perplexity) for chatbots, image generation and more. At some point soon I need to trim down to Poe and ChatGPT and Perplexity.
Yeah sounds like you've got lots of overlapping products covering similar needs at this point. I'd be curious to hear more about Perplexity Pro. I'm personally not a huge Perplexity user in general, but many people I know who love Perplexity seem to be quite happy with just the "free" version. Would be interesting to know which of the Pro features justify paying for it as far as you're concerned.
This is good; with so much noise, it’s good to get a signal I trust. I’ve settled into Copilot as it’s easy and natural for me to stay within the MSFT ecosystem, but they just updated it for speech and I don’t like it so much. Image generation is more difficult, speech is laggy on my old phone, and it tries so hard to be friendly and sustain engagement. Much cringe. No more sliders to control tone or tweak images. Think I’ll move over to ChatGPT.
I think both Microsoft and Google (and Apple when they finally get their shit together) have massive advantages in terms of the existing ecosystem when it comes to rolling out AI. People who are already using their suite of products will naturally prefer an AI model that can integrate natively with those, including their existing documents, files, etc.
Especially since the current frontier models are converging in terms of performance, so you might as well stick to the one that works with your existing apps and tools.
Early days, I was really hoping this round would disrupt some of my not-so-favorite behemoths like Google, but now it looks like you’re right and they’re all following an evolutionary path, integrating AI into existing products. This round has created a new monster in OpenAI. I’m still waiting for the killer apps from some startups; I know they’re coming.
Yup. These are still very early days. Lots of rapid shipping but the true value will eventually come with widespread integration of those into useful products.
Though it’s sometimes heavily censored, Google Imagen 3 (aka ImageFX) is actually great, especially at understanding and presenting complex context, where I think it’s better than DALL-E 3 and Ideogram. And it’s free!
As for video, Hailuo AI is indeed really good. And thank you for the Sunday post introduction.
I agree, Google Imagen 3 is excellent, too. My needs are just covered well enough by the three tools I've mentioned that I can't justify adding another regular image model. I do check in on it here and there, along with many other image models, just to keep tabs on the developments. (Google also recently finally rolled it out to Gemini instead of only having it available in research preview in the AI Test Kitchen, where I'd been using it prior to that.)
I'll have to try how it handles complex prompts, now that you mention it's better than Ideogram 2.0 at that - Ideogram is now my gold standard for prompt-following.
As for Hailuo AI, I honestly was floored by just how great it was at visual consistency and realistic movements when animating from a still image. The best I've seen thus far, which earned it a "God Tier" in my tests: https://www.whytryai.com/p/free-ai-image-to-video-tools-tested.
Not sure I follow you about the Sunday post, but I guess you're welcome ;)
By the way, I forgot to mention Google Imagen 3's "edit image", which is also a very useful feature and easy to use. Just choose the fitting brush and select the area. Then type "fix it" in the prompt box, and it will magically do the job most of the time.
Ah, so it's their version of inpainting? Pretty cool. You have it with Midjourney and Adobe's products, too. I'll revisit Imagen 3 to see how it holds up!
Good article Dan, thanks. It's helpful to have your greatest hits all in one place.
I know I'm obsessed with this far more than most people, but one of the features of ChatGPT I like is the simplicity of the interface. Other AI companies would be wise to learn from this example.
That said, ChatGPT has recently lost the ability to display scroll bars in Chrome in my use, thus completely destroying the entire service in that browser. Other AI companies would be wise to learn from this too.
I hear you regarding your disdain for ChatGPT text, and will have to follow your example in exploring Claude. Thanks for the nudge.
I think you'll enjoy Claude. Similarly simple interface with the nice Artifacts window where it outputs content you're working on. You just get a very limited number of free messages (around 6-7 per 6 hours or so), so you gotta be economical with them.
Also, I'm probably just jaded when it comes to ChatGPT's writing simply because I've used it the longest and it reads "stale."
Love it Daniel. I’d forgotten about Gemini’s long context window. Thanks for the tip!
You got it. Gemini's great for parsing even stuff like longer videos and extracting details from those. Worth a try!
Unlock the power of AI with STORM and NotebookLM! Turn written content into engaging podcast-style audio summaries in just a few clicks. Perfect for teachers, students, and anyone looking to supercharge their workflow, STORM organizes your research while NotebookLM transforms it into audio you can listen to on the go. https://peterpappas.com/2024/09/ai-magic-create-podcast-style-summaries-with-storm-and-notebooklm.html