Director GAN Concept

One GAN to rule them all

Motivation (Why)

As datasets grow and generating quality outputs takes more computation time, reusing pre-trained models becomes increasingly necessary. However, there is currently no convenient method for using multiple GANs together without re-training them. To address this, I propose a concept that uses pre-trained AI models as-is and produces a high-quality result by adding a Director GAN layer on top of them. This AI computes and decides which generated output should be picked or used, based on the requirement or prompt at hand.

Tactics (What & How)

I first realized the need for such a layer of AI while working on DeepFakes for Good, trying to map facial and head movements to make 'Dwight' from the show 'The Office' deliver a line of dialogue from 'Gollum' of 'The Lord of the Rings'. It would have been great if I could have used two different pre-trained models together and still gotten a perfect output. I encountered the problem again when I noticed accent bias while generating audio DeepFakes, and wondered: can't we generate output with a diverse accent, way of speaking, etc., based on the chosen prompt?
I then thought that I’d love to work on creating a Director-AI that understands when to use which pre-trained model to get the best output for any objective, similar to a conscious layer of the mind over the subconscious. Below is an example flow diagram for a prompt to generate a face, where pre-trained models that each generate certain facial features are used as input.

Reflection & Application (Now & Next)

The Director GAN has many possible applications. For example, we can create many well-trained generative AI models and then use them as-is, interfacing them together. The Director GAN would be responsible for assessing which result best fits the contextual prompt. The process is similar to building a good team, which can do wonders!
Furthermore, because the pre-trained models are reused rather than re-trained, this has the potential to reduce the use of computational resources while still producing high-quality results.
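
The selection step described above, where the Director assesses which result best fits the prompt, could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not an implementation of the concept: the Director class, the word_overlap scorer, and the two toy "generators" are all hypothetical stand-ins. In a real system the generators would be frozen pre-trained models and the scorer would presumably be a learned model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical types: a frozen generator maps a prompt to an output, and a
# scorer rates how well an output fits the prompt (higher is better).
Generator = Callable[[str], str]
Scorer = Callable[[str, str], float]


@dataclass
class Director:
    """Picks the best output among several pre-trained generators, used as-is."""
    generators: Dict[str, Generator]
    scorer: Scorer

    def direct(self, prompt: str) -> Tuple[str, str, float]:
        """Run every generator and return (name, output, score) of the best fit."""
        if not self.generators:
            raise ValueError("at least one generator is required")
        candidates = []
        for name, generate in self.generators.items():
            output = generate(prompt)  # generators are never re-trained here
            candidates.append((name, output, self.scorer(prompt, output)))
        return max(candidates, key=lambda candidate: candidate[2])


def word_overlap(prompt: str, output: str) -> float:
    """Toy scorer (an assumption): fraction of prompt words found in the output."""
    prompt_words = set(prompt.lower().split())
    output_words = set(output.lower().split())
    return len(prompt_words & output_words) / max(len(prompt_words), 1)


# Toy stand-ins for pre-trained models; each returns a text description
# instead of an image so the sketch stays self-contained.
generators = {
    "smile_model": lambda prompt: "face with a wide smile",
    "glasses_model": lambda prompt: "face wearing round glasses",
}

director = Director(generators, word_overlap)
name, output, score = director.direct("a face wearing glasses")
print(name, output, score)
```

Note that this sketch runs every generator before choosing. A Director that predicts from the prompt alone which model to call would avoid that cost, but it is not shown here.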

Creators

Suryakant Sahoo