Neural Network Generates High-Quality Images From Basic Text Descriptions

AI research advances every day. Although we still cannot build a general AI that combines the capabilities found across today's separate AI projects, the progress made with artificial intelligence in the last couple of years is astounding. We already have access to a range of AI assistants, such as Google Assistant, Apple's Siri, Microsoft's Cortana, and Amazon Alexa.

Aside from providing users with handy tools that make everyday life easier, AI research also tackles more complex (not that AI assistants aren't complex), more advanced, and more creative tasks. For example, Google's DeepMind AI will learn how to play StarCraft II, a complex real-time strategy game in which every decision counts and where an AI can learn numerous tactics as well as improvisation and fast decision making. Researchers from MIT built a system that predicts upcoming events from still images and can even generate short videos based on those predictions.

Another interesting piece of AI research comes from scientists at Rutgers University, the University of North Carolina at Charlotte, Lehigh University, and the Chinese University of Hong Kong. They fed basic text descriptions to a neural network, which then generated high-quality images of the described objects using only the information contained in those descriptions.

Han Zhang, one of the researchers who worked on the project, explained: “Generating realistic images from text descriptions has many applications. Previous approaches have difficulty in generating high-resolution images, and their synthesized images in many cases lack details and vivid object parts. Our StackGAN for the first time generates 256 x 256 images with photo-realistic details.”

StackGAN first draws a low-resolution image that only roughly matches the given description, then progressively adds detail, higher resolution, and sharper form until the result closely mirrors the textual description.
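The coarse-to-fine idea can be illustrated with a toy sketch. The code below is a hypothetical stand-in, not the actual StackGAN model: the function names, the 128-dimensional "text embedding", and the toy math are all assumptions; the real system uses trained convolutional generators. What it does show is the two-stage shape of the pipeline: Stage I maps a text embedding (plus noise) to a coarse 64x64 image, and Stage II upsamples it to 256x256 and refines it while conditioning on the same embedding.

```python
import numpy as np

def stage1_generator(text_embedding, rng):
    """Stage I (toy sketch): produce a coarse 64x64 RGB image that only
    roughly reflects the text conditioning. A real Stage-I generator is a
    trained conv-net; here we just mix the embedding with noise."""
    noise = rng.standard_normal(100)
    seed = np.concatenate([text_embedding, noise])
    img = np.tanh(np.outer(seed[:64], seed[:64]))   # shape (64, 64)
    return np.stack([img] * 3, axis=-1)             # shape (64, 64, 3)

def stage2_generator(low_res, text_embedding):
    """Stage II (toy sketch): upsample the coarse image 4x to 256x256 and
    add a small text-conditioned refinement, standing in for the learned
    detail-adding network."""
    up = low_res.repeat(4, axis=0).repeat(4, axis=1)  # (256, 256, 3)
    refinement = 0.1 * np.tanh(text_embedding[:3])    # toy per-channel detail
    return np.clip(up + refinement, -1.0, 1.0)

rng = np.random.default_rng(0)
embedding = rng.standard_normal(128)   # stands in for a learned text encoder
coarse = stage1_generator(embedding, rng)
fine = stage2_generator(coarse, embedding)
print(coarse.shape, fine.shape)        # (64, 64, 3) (256, 256, 3)
```

The key design point mirrored here is that both stages see the same text conditioning, so Stage II does not merely sharpen pixels; it can correct and enrich details the coarse image missed.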

Instead of training a neural network to recognize existing objects or particular object classes such as faces or cars, the researchers trained an AI to create something entirely new, something not drawn directly from prior examples. Because the images are generated from text descriptions, the network must first “read” a description and then produce a matching image, a notable example of neural networks exhibiting creativity, one of the defining traits of our species.

The research could serve as a basis for further studies, and the potential applications of text-to-image generation are huge. We could get AI artists; AI could produce depictions of wanted criminals from descriptions given by witnesses; or AI could quickly render smartphones, tablets, laptops, and other devices from a written specification of how a device should look. The possibilities are endless, and giving AI another creative trait brings machines one step closer to humans.
