♪ (intro music) ♪

It's really about what machine learning is capable of, and how we can extend human capabilities. And we want to think about more than just discovering new approaches and new ways of using technology; we want to see how it's being used and how it impacts the human creative process.

So imagine you need to find or compose a drum pattern. You have some idea of a drum beat you'd like to compose, and all you need to do now is go to a website where a pre-trained model of drum patterns is sitting online -- you just need a web browser. You give it some human input and you can generate a space of expressive variations. You can tune and control the type of outputs you're getting from this generative model, and if you don't like them, you can keep exploring this generative space.

So this is the type of work that Project Magenta focuses on. To give you a bird's eye view of what Project Magenta is about: it's a group of researchers, developers, and creative technologists engaged in generative models research. You'll see this work published at machine learning conferences, you'll see the workshops, you'll see a lot of research contributions from Magenta. You'll also see the code that, after it's been published, is put into an open source repository on GitHub in the Magenta repo. From there, we see ways of thinking about and designing creative tools that can enhance and extend the human expressive creative process, eventually ending up in the hands of artists and musicians, inventing new ways we can create and inventing new types of artists.

So, I'm going to give three brief overviews of the highlights of some of our recent work.

This is PerformanceRNN. How many people have seen this? It was one of the demos earlier today. A lot of people have seen and heard of this kind of work, and this is what people typically think of when they think of a generative model: "How can we build a computer that has the kind of intuition to know the qualities of things like melody and harmony, but also expressive timing and dynamics?" And what's even more interesting now is that you can explore this for yourself in the browser, enabled by TensorFlow.js.

So, this is a demo we have running online. We have the ability to tune and control some of the output we're getting. In a second, I'm going to show you a video of what that looks like -- you may have seen it out on the demo floor, but we'll show it to you and to everyone watching online. We were also able to bring it even more alive by connecting a Disklavier baby grand piano, which is also a MIDI controller, so we can perform alongside the generative model as it reads in the input from the human playing the piano. So, let's take a look.

♪ (piano) ♪

So this is trained on classical music data from actual live performers. It's from a data set that we got from a piano competition.

♪ (piano) ♪

I don't know if you noticed, but this is Nikhil from earlier today. He's actually quite a talented young man. He helped build out the browser version of PerformanceRNN.

♪ (piano) ♪

And so we're thinking of ways to take bodies of work, train a model off of that data, and then create these open source tools that enable new forms of interaction, of creativity, and of expression. And all of these points of engagement are enabled by TensorFlow.
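The on-stage demo is a custom TensorFlow.js application, but the general pattern it illustrates -- load a pre-trained Magenta checkpoint in the browser, feed it a human seed, and generate a continuation you can tune -- can be sketched with the @magenta/music package. This is a minimal sketch only: the checkpoint URL, the seed notes, and the parameter values below are assumptions for illustration, not the code behind PerformanceRNN itself.

```typescript
// Sketch: continue a short human-played phrase in the browser with a
// pre-trained Magenta RNN checkpoint (built on TensorFlow.js).
// The checkpoint URL and all numeric values are illustrative assumptions.
import * as mm from '@magenta/music';

const CHECKPOINT =
  'https://storage.googleapis.com/magentadata/js/checkpoints/music_rnn/basic_rnn';

// A tiny seed sequence played by the human (pitches are MIDI note numbers).
const seed: mm.INoteSequence = {
  notes: [
    {pitch: 60, startTime: 0.0, endTime: 0.5, velocity: 80},
    {pitch: 64, startTime: 0.5, endTime: 1.0, velocity: 80},
    {pitch: 67, startTime: 1.0, endTime: 1.5, velocity: 80},
  ],
  totalTime: 1.5,
};

async function generate() {
  const model = new mm.MusicRNN(CHECKPOINT);
  await model.initialize();

  // Quantize the seed and ask the model to continue it for 64 steps.
  const quantized = mm.sequences.quantizeNoteSequence(seed, 4);
  const continuation = await model.continueSequence(quantized, 64, 1.1);

  // Play the result through the browser's built-in synth player.
  const player = new mm.Player();
  player.start(continuation);
}

generate();
```

The temperature argument (1.1 here) plays the same role as the tuning controls in the demo: lower values stay close to the training data, higher values wander further into the generative space.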
The next thing I want to talk about that we've been working on is variational autoencoders. How many people are familiar with latent space interpolation? Okay, quite a few of you. If you're not, it's quite simple: you take human inputs and train them through a neural network, compressing them down to an embedding space. You compress each input down to some lower dimensionality and then reconstruct it, comparing the reconstruction with the original as you train, and you build a space around that. What that gives you is the ability to interpolate from one point to another, touching intermediate points where a human never gave any input. So the model can generate examples it has never seen, because it's building an intuition off of the examples it was trained on.

So, you can imagine, if you're an animator, there are so many ways of going from cat to pig. How would you animate that? There's an intuition the artist would have in creating that sort of morphing from one to the other, and now we're able to have the machine learning model do this as well.

We can also do this with sound, right? This technology actually carries over to multiple domains. So, this is NSynth, which we released, I think, some time last year. It takes that same idea of moving from one input to another. Let's take a look; you'll get a sense of it.

Piccolo to electric guitar.

(electric guitar sound to piccolo) (piccolo sound to electric guitar) (piccolo and electric guitar sound together)

So, rather than remixing or fading from one sound to the other, what we're actually able to do is find these intermediate, newly synthesized sound samples and produce those. There are a lot of components to it -- there's a WaveNet decoder, for example -- but really it's the same underlying encoder-decoder, variational autoencoder technology.

But when we think about the types of tools musicians use, we think less about training machine learning models. We see guitar pedals -- knobs and pedals used to tune and refine sound, to cultivate the kind of art and flavor a musician is looking for. We don't think so much about setting parameter flags or writing lines of Python code to create this sort of art. So here's what we've done. Not only are we interested in finding and discovering new things; we're also interested in how those things get used -- used by practitioners, used by specialists. So we've created hardware: we've taken the machine learning model and put it into a box that a musician can just plug into and explore this latent space in performance. Take a look at how musicians feel and what they think about this process.

♪ (music) ♪

(woman) I just feel like we're turning a corner of what could be new possibility. It could generate a sound that might inspire us.

(man) The fun part is, even though you think you know what you're doing, there's some weird interaction happening that can give you something totally unexpected.

I mean, it's great research, and it's really fun, and it's amazing to discover new things, but it's even more amazing to see how it gets used and what people create alongside it. And what's even better is that we've just released NSynth Super, in collaboration with Creative Lab London. It's an open source hardware project.
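As an aside before moving on, the interpolation recipe described above -- encode two inputs, walk the line between their latent codes, and decode each intermediate point -- can be sketched in a few lines of TensorFlow.js. The encoder and decoder URLs, shapes, and step count below are hypothetical; only the pattern is the point.

```typescript
// Conceptual sketch of latent-space interpolation with TensorFlow.js.
// The model URLs are placeholders for hypothetical pre-trained halves
// of a variational autoencoder.
import * as tf from '@tensorflow/tfjs';

async function interpolate(inputA: tf.Tensor, inputB: tf.Tensor, steps = 8) {
  const encoder = await tf.loadLayersModel('https://example.com/vae/encoder.json');
  const decoder = await tf.loadLayersModel('https://example.com/vae/decoder.json');

  // Compress each input down to a latent code (the "embedding space").
  const zA = encoder.predict(inputA) as tf.Tensor;
  const zB = encoder.predict(inputB) as tf.Tensor;

  // Linearly blend the two codes and reconstruct each intermediate point,
  // producing outputs the model never literally saw during training.
  const outputs: tf.Tensor[] = [];
  for (let i = 0; i <= steps; i++) {
    const t = i / steps;
    const z = zA.mul(1 - t).add(zB.mul(t)); // (1 - t) * zA + t * zB
    outputs.push(decoder.predict(z) as tf.Tensor);
  }
  return outputs;
}
```

In NSynth the encoder and decoder operate on raw audio (with a WaveNet decoder), but the interpolation step itself is exactly this kind of walk through the latent space.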
All the information and the specs are on GitHub. We cover everything from the potentiometers, to the touch panel, to the code and the hardware it runs on. And this is all available to everyone here today -- you just go online and check it out yourself.

Now, music is more than just sound, right? It's a sequence of events unfolding over time. So when we think about what it means to have a generative music space, we also think about melodies. Just like we went from cat to pig, what is it like to go from one melody to the next? And moreover, once we have that technology, what does it look like to create with it? You have this expressive space of variations -- how do we design an expressive tool that takes advantage of it? And what will we get out of it?

So this is another tool, developed with another team at Google, that puts melodies in a latent space -- the same kind of interpolation -- and then lets you build a song, or some sort of composition, with it. So let's take a listen. Say you have two melodies....

♪ ("Twinkle Twinkle Little Star") ♪

And in the middle....

♪ (piano playing variation) ♪

You can extend it....

♪ (piano playing variation) ♪

And we really are just scratching the surface of what's possible. How do we continue to have the machine learn and develop a better intuition for what melodies are about?

So again, to bring it back full circle: using different compositions and musical works, we're able to train a variational autoencoder to create an embedding space, build tools on top of it, and enable open source communities to design creative tools for artists that push the expressive boundaries we currently have. This is, again, just released! It's on our blog, all the code is open source and available to you, and it's also enabled by TensorFlow -- along with all these other things, including Nikhil's demo, enabled by this kind of work, creativity, and expressivity.
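To give a feel for how that melody interpolation might be driven from the browser, here is a rough sketch using the MusicVAE model in the @magenta/music package. The checkpoint URL, the two seed melodies, and the playback details are illustrative assumptions, not code from the demo shown in the talk.

```typescript
// Sketch: morph between two short melodies with MusicVAE in the browser.
// Checkpoint URL and seed melodies are assumptions for illustration.
import * as mm from '@magenta/music';

const CHECKPOINT =
  'https://storage.googleapis.com/magentadata/js/checkpoints/music_vae/mel_2bar_small';

// Two tiny quantized melodies (MIDI pitches, 4 steps per quarter note,
// 32 steps = two bars of 4/4).
const melodyA: mm.INoteSequence = {
  quantizationInfo: {stepsPerQuarter: 4},
  notes: [
    {pitch: 60, quantizedStartStep: 0, quantizedEndStep: 4},
    {pitch: 67, quantizedStartStep: 4, quantizedEndStep: 8},
  ],
  totalQuantizedSteps: 32,
};
const melodyB: mm.INoteSequence = {
  quantizationInfo: {stepsPerQuarter: 4},
  notes: [
    {pitch: 72, quantizedStartStep: 0, quantizedEndStep: 8},
    {pitch: 69, quantizedStartStep: 8, quantizedEndStep: 16},
  ],
  totalQuantizedSteps: 32,
};

async function morphMelodies() {
  const vae = new mm.MusicVAE(CHECKPOINT);
  await vae.initialize();

  // Ask for 5 sequences: the two endpoints plus three in-between melodies
  // reconstructed from intermediate points in the latent space.
  const sequences = await vae.interpolate([melodyA, melodyB], 5);

  // Play each step of the morph in turn.
  const player = new mm.Player();
  for (const seq of sequences) {
    await player.start(seq);
  }
}

morphMelodies();
```

The request for 5 outputs mirrors what you just heard: the two original melodies at the ends, with the "in the middle" variations filled in by the model.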
And so, in wrapping up, I want to take us back to the demo we saw. The most interesting, and maybe the coolest, thing about this demo was that we didn't even know it was being built until it was tweeted by Tero, a developer from Finland. And the fact of the matter is that we're just barely scratching the surface. There's so much to do, so much to engage in, and so much to discover. And we want to see so much more of this. We want to see more developers, more people sharing things, and more people getting engaged -- not just developers, but artists and creatives as well. We want to explore and invent and imagine what we can do with machine learning together, as an expressive tool.

So, go to our website, g.co/magenta. There you'll find our publications and these demos -- you can experience them yourself -- and more. You can also join our discussion group. So here's g.co/magenta: join our discussion group, become part of the community, and share the things you're building, so we can do this together. Thank you so much.

(applause)

So that's it for the talks today. We had an amazing, amazing show, with an amazing spread of speakers and topics. Now, let's take a look at a highlight reel of the day.

♪ (music) ♪

Earlier this year we hit the milestone of 11 million downloads. We're really excited to see how much people are using this and how much impact it's having in the world. We're very excited today to announce that deeplearn.js is joining the TensorFlow family.

♪ (music) ♪

(man) The software, TensorFlow, is also an early-stage project. And so we'd really love for you to get interested and help us to build this future.

♪ (music) ♪

(man) I told you at the beginning that our mission for tf.data was to make [inaudible] processing that is fast, flexible, and easy to use.

♪ (music) ♪

(woman) I'm very excited to say that we have been working with other teams in Google to bring TensorFlow Lite to Google Labs.

♪ (music) ♪

(man) In general, the Google Brain team's mission is to make machines intelligent, and then use that ability to improve people's lives. I think these are good examples of where there's real opportunity for this.

♪ (music ends) ♪ (applause) ♪ (music) ♪