A tool for Collaborating over GAN’s latent space



In January 2020 we finalized the development phase of Marrow. Shirin Anlen and I are sharing lessons learned during this process, and our post about optimizing and augmenting a small dataset was recently published on Towards Data Science. This post looks at how custom web-based tools can inspire a collaborative artistic workflow when working with machine learning models.

Avnerus

Apr 29 · 8 min read

Shadow animation from GAN’s latent space using the web explorer tool

Myself and Marrow

Marrow is a hands-on research project and an interactive theater experience by Shirin Anlen that explores the possibilities of mental disorders in machine learning. I have previously worked with Shirin on a number of projects, most notably the VR documentary Tzina: Symphony of Longing. In 2018 I joined Shirin to preview Marrow as an installation at IDFA Doclab 2018. The prototype was a success, and one year later we entered, as collaborators, an intensive development phase co-produced by the National Film Board of Canada and Atlas V.

About GAN and its latent space

The Generative Adversarial Network, or GAN, was the first machine learning model we decided to research. It focuses on generative visual imagery and exhibits a very clear dissonance if you attempt to train it on complex concepts using banal stock images. In a previous post we described how we created a dataset of ‘Perfect family dinner’ images and used it to train StyleGAN V1. This particular dataset was constructed to serve the story of the experience: that of a dysfunctional family that sees itself only through the distorted data it was trained on. Because of this, we aimed for results that are imperfect and that represent the glitches that emerge when the model tries to go deep into social narratives.

Our dataset was a bundle of around 6,500 images containing figures of four family members, stripped away from their family dinner setting. Once StyleGAN finished the training process, we ended up with a vast space of possibilities for newly generated images containing four distorted familial figures. This infinite, continuous space of possible output images is called the Latent Space. It is “latent” because the output image generated by GAN is determined by a seemingly hidden process of mathematical transformations, starting from a series of numbers and ending with a bitmap image. When you change any of the initial numbers in the series, the resulting image will be slightly different. The transformation network is so deep that it’s hard to predict what will change in the image.
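To make the "series of numbers in, bitmap out" idea concrete, here is a minimal toy sketch. It is not StyleGAN and not the model used in Marrow: the three small layers, their random weights, the 8-number latent vector and the 4x4 output are all assumptions chosen to keep the example readable. It only shows the behaviour described above: nudging one input number shifts the output image slightly.

```python
import math
import random

LATENT_SIZE = 8   # toy size; StyleGAN's latent vector has 512 numbers
IMAGE_SIDE = 4    # a 4x4 grayscale "bitmap"

def make_layer(rng, n_in, n_out):
    """Random weights standing in for what training would normally learn."""
    return [[rng.gauss(0, 1 / math.sqrt(n_in)) for _ in range(n_in)]
            for _ in range(n_out)]

def apply_layer(weights, values):
    """One dense layer followed by a tanh non-linearity."""
    return [math.tanh(sum(w * v for w, v in zip(row, values)))
            for row in weights]

rng = random.Random(0)
LAYERS = [
    make_layer(rng, LATENT_SIZE, 16),
    make_layer(rng, 16, 16),
    make_layer(rng, 16, IMAGE_SIDE * IMAGE_SIDE),
]

def generate(latent):
    """Map a series of numbers to pixel values in the range 0..1."""
    values = latent
    for weights in LAYERS:
        values = apply_layer(weights, values)
    return [(v + 1) / 2 for v in values]  # rescale tanh output to 0..1

def mean_abs_difference(image_a, image_b):
    """Average per-pixel change between two generated images."""
    return sum(abs(a - b) for a, b in zip(image_a, image_b)) / len(image_a)

latent = [rng.gauss(0, 1) for _ in range(LATENT_SIZE)]
nudged = list(latent)
nudged[3] += 0.05  # change a single number in the series

original_image = generate(latent)
nudged_image = generate(nudged)
print(mean_abs_difference(original_image, nudged_image))
```

Even in this three-layer toy it is hard to say in advance which pixels will move and by how much; in a real, far deeper generator that unpredictability is much stronger.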

If you have a good enough dataset and algorithm, you might be able to reach disentanglement: that is when one of the input numbers controls one meaningful element in the resulting image. For example, one number would change the age of one generated person, while another changes their hair color. Needless to say, we were not able to achieve disentanglement with our small dataset. A change in a single number from the initial series could induce various changes in multiple family members. The same number could simultaneously control one family member’s pose, another member’s smile, and the appearance of a Christmas hat on a third figure (a repeating motif in stock images, it seems). The family members were in fact entangled.
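The difference between the two cases can be sketched with two toy generators. Both are hypothetical stand-ins, not our trained model: one mixes every latent number into every pixel, the other is an idealised generator in which latent number i drives only region i. The four "regions" stand in for the four family members, and the sensitivity measure is an assumption made for illustration.

```python
import math
import random

REGIONS = 4            # stand-ins for the four family members
PIXELS_PER_REGION = 4
LATENT_SIZE = 4

rng = random.Random(1)
# Dense random weights: every latent number feeds every pixel.
MIX = [[rng.gauss(0, 1) for _ in range(LATENT_SIZE)]
       for _ in range(REGIONS * PIXELS_PER_REGION)]

def entangled_generator(latent):
    """Toy generator in which each pixel depends on all latent numbers."""
    return [math.tanh(sum(w * z for w, z in zip(row, latent))) for row in MIX]

def disentangled_generator(latent):
    """Idealised case: latent number i drives only region i."""
    return [math.tanh(latent[pixel // PIXELS_PER_REGION])
            for pixel in range(REGIONS * PIXELS_PER_REGION)]

def region_sensitivity(generator, latent, index, step=0.1):
    """Total pixel change per region when one latent number is nudged."""
    nudged = list(latent)
    nudged[index] += step
    before, after = generator(latent), generator(nudged)
    changes = []
    for region in range(REGIONS):
        start = region * PIXELS_PER_REGION
        pixels = range(start, start + PIXELS_PER_REGION)
        changes.append(sum(abs(after[p] - before[p]) for p in pixels))
    return changes

latent = [rng.gauss(0, 1) for _ in range(LATENT_SIZE)]
entangled_changes = region_sensitivity(entangled_generator, latent, index=0)
disentangled_changes = region_sensitivity(disentangled_generator, latent, index=0)
print("entangled:   ", [round(c, 3) for c in entangled_changes])
print("disentangled:", [round(c, 3) for c in disentangled_changes])
```

In the entangled toy, nudging the first number moves all four regions at once, which is the behaviour we saw with our family figures. In the idealised one, only the first region changes.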

The Shadow Allegory

Marrow tracks the ‘thinking’ process of each of its models and questions what could go wrong. In GAN, the latent space gives us information about how input data is broken down and then reconstructed into something new. But as intriguing as visualizing the latent space is, we were looking for ways to integrate storytelling into the experience. We wanted to materialize GAN’s distorted image of the world.

When watching the ongoing training process of GAN, we started noticing things in the output that were other than human, yet came from the source dataset. It was like staring at Rorschach tests: flat images that appear different depending on who is watching. We realized that we were learning more about GAN not by seeing the results we expected, but by seeing its in-between spaces. Plato’s Allegory of the Cave speaks about finding meaning in the simple and flattened representation of things. The people in the allegory are stuck in a cave with a fire burning behind them. The fire projects the shadows of passing objects onto the cave’s walls, and that is all they can see of reality. They are so used to those shadows that once a prisoner breaks free, their eyes get burned by the flaring sun. When the prisoner’s eyes are finally accustomed to reality, they come back to the cave to tell the others, but now they are unable to see anything in the darkness. The other prisoners assume that something evil lies outside.

Interestingly, Plato’s Allegory of the Cave corresponds quite well with the structure and training process of GAN. GAN is in constant conflict between reality, representations of reality, and fantasy. When the algorithm generates images that are too close to the original dataset, it finds itself stuck in a simple and flat representation of the world, unable to escape to pathways of creativity. When GAN’s generations are too fantastical, they are inevitably deemed fake and wrong. GAN is in a constant struggle to find the balance between the real and the imaginary. Therefore, we decided to visualize GAN’s struggle by using the shadow representation of the distorted family outputs.

Animating over the latent space

Marrow is an interactive theater piece where the participants play the role of machine learning models in a family dinner setting. In the experience, a participant who represents GAN tells their story about the difficulties they face in discerning memory from imagination. Both of those perceptions are in fact distorted in GAN, so at this phase we decided to explore an additional layer of fantastical animation over the world of shadows, one that would represent the character’s struggle between the real and the fake. We worked with the talented Paloma Dawkins, a master of hand-drawn animations and alternate dimensions. Now we had to ask ourselves: how do we orchestrate a workflow that starts in the mathematical depths of GAN, but ends with hand-drawn animations that perfectly match GAN’s latent movements across the image space? The answer came in the form of our custom-developed tool: Marrow GAN Explorer.
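As a rough illustration of what a "latent movement" can look like, the sketch below builds a frame-by-frame path through a few latent keyframes using plain linear interpolation. It is not the Marrow GAN Explorer code: the function names, the 8-number latent size and the choice of linear interpolation are assumptions made for this example. In a real pipeline each vector on the path would be passed to the trained generator to render one frame.

```python
import random

def lerp(start, end, t):
    """Linear interpolation between two latent vectors, with t in 0..1."""
    return [(1 - t) * a + t * b for a, b in zip(start, end)]

def latent_path(keyframes, frames_per_segment):
    """Return one latent vector per frame, passing through every keyframe."""
    path = []
    for start, end in zip(keyframes, keyframes[1:]):
        for frame in range(frames_per_segment):
            path.append(lerp(start, end, frame / frames_per_segment))
    path.append(list(keyframes[-1]))  # finish exactly on the last keyframe
    return path

rng = random.Random(2)
LATENT_SIZE = 8  # toy size, chosen for readability
keyframes = [[rng.gauss(0, 1) for _ in range(LATENT_SIZE)] for _ in range(3)]

path = latent_path(keyframes, frames_per_segment=12)
print(len(path), "latent vectors, one per animation frame")
```

Because consecutive vectors on the path differ only slightly, consecutive generated images differ only slightly too, which is what lets an animator draw over the sequence frame by frame.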
