Text-to-image: New improved AI images

February 12, 2024

We are happy to announce the latest improvement in our AI-powered image generation technology.

Our text-to-image generator now makes a million images per day (26.5M images in January), which makes it one of the most popular Freepik products launched last year. We love that you’re loving it, and we are bringing some good news: the technology we use for AI image generation has been updated, and the generator is better for it.

What does it mean in practice?

Simply, everything is better

The quality of the images generated with our text-to-image tool has been massively upgraded. Basically, every aspect of an image you can think of is now better. Thanks to the update, details are enhanced, and photos now look more realistic than before. Let’s look through some before and after examples for the same prompts to see the differences.

You’ll notice that all the aspects mentioned below have visibly changed in every image. On the left, we have the “Before” images generated with the “old” text-to-image generator. On the right, labeled “After”, are images made after the update. The images were generated with identical prompts.

Textures

Textures on objects and surfaces are closer to real life, adding depth to the final result. Skin pores, wood grain, and animal fur all look truer now.

Notice not only the skin and hair textures but also the lighting in these images. If you look really closely, you can see an actual reflection in the eye on the right.

“Photography of a brown female eye”

Yes, they both sure are very good boys. Still, we must point out the better lighting, softer colors, and detail in the fur structure in the image on the right.

“A beautiful and fluffy border collie”

Features of people

People in the images now look more refined and realistic. Details like faces, limbs, and hands are closer to reality than before. The AI can still occasionally add a “gummy finger,” but that is far less likely now.

The difference between these two doesn’t even need commentary. Everything from lighting to human characteristics and the environment looks more natural. The final result on the right takes the vintage and bohemian style from the prompt, whereas the before image seems to ignore it.

“Portrait supermodel with wild flowers in the mountains bohemian sunlight minolta vintage”

The pinkish shade of white and the clouds give the before image an unrealistic feeling. The sunlight in the right image looks natural, as if it’s coming from the side through a window with realistic yellow tones.

“A cinematic still frame of lana del rey barely awake light streaks dust in the air indoor photography”

The shadows on the model’s face look more realistic, and so does the skin. In the before image, the skin is too smooth and overly shiny, with a blueish undertone, which makes it look more like a plastic doll than an actual human.

“A woman with a futuristic binary dot matrix over her face metallic minimalist glitch effect”

Even a forest inside a human now looks more realistic. Look.

“Create a green surreal double exposure photo of a silhouette and a forest. Nothing outside the image. The background should be minimalistic”

Colors and lighting

Proper lighting is crucial for realistic images. The direction, intensity, and color of light sources now produce better shadows and highlights. This contributes to a nicely balanced and harmonious color palette, which creates a better overall image aesthetic.

The colors on the left resemble a futuristic movie, or at least look heavily edited. The picture on the right has more of an “I snapped this on my way to work” vibe.

“Blank empty billboard mockup at the bus stop in the middle of New York street”

We love the cyberpunk style, and with the new generations, so will you. Look how nicely the neon lights work in the dark environment.

“A tiny robot in the streets of a cyberpunk city volumetric light detailed octane render”

The before image here looks like it was taken in a studio in front of a green screen. The mug looks plastic-ish, and there are way too many light sources. If we didn’t know any better, we’d actually be jealous of someone having coffee during a sunrise like in the after image.

“A warm and cozy coffee mug at sunrise”

Better illustrations

Like photos, illustrations also contain more details and nicer colors. The new results are, therefore, higher quality, with more depth, character, and balance. The examples speak for themselves.

The girlies on the right definitely look happier if you ask us. The illustration nicely shows the light and shadows from the fire.

“Poster in a tender animation style of a group of three girls happily holding a bonfire by a river in a park in the evening delicate face lawn tent street lights picnic cartoon illustration”

We’re back to more details and better colors, but we also love the composition of the after image more.

“Postmodernist collage about futuristic technology and artificial intelligence historical illustrations modern magazine cutouts newspaper”

Notice the shadows that don’t really follow the legs in the before image?

“An illustration of a man staring at the stars, saturated turquoise colors, by Chelsey Bonestell, by syd mead”

We could go on and on with the examples, but we’d prefer if you tried it yourself. Go generate a bunch of beauties with the improved generator now.

Generate with text to image

Kudos to the HuggingFace team

At Freepik, we use a very well-known open-source library from HuggingFace called diffusers as part of the creation process for image generation. We realized that the HuggingFace library was producing images with defects. We reported this to their team in great detail, and YiYi Xu took over and identified the error. They then took the lead in creating a new version of the library with the fix. We would love to thank YiYi Xu and the HuggingFace team for leading the way for open-source AI.

By Bianka