nvidia Self drivin' learnin' the way you do (... creepy)

barn · December 17, 2017

Have your critical minds set to 'active', still it's the times we happen to live in. Mind blowing stuff, really.

Have your perception extended.

For those who wish to read the papers and the rationale: here, and this one too.

Barnsley

Kikker · December 18, 2017

Well, to explain the AI thing properly. You have two adversarial networks. An adversarial network is a pair of competing neural networks: one tries to produce fake images that look like real images, and the other tries to distinguish real images from fake ones. The result is a network capable of producing very realistic images. So in this case there are two adversarial networks, one for winter and one for summer, which can produce realistic fake images of both domains. The problem is the translation from one domain to the other.
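To make the two competing objectives concrete, here is a minimal sketch of the loss each network would minimize, assuming the common binary cross-entropy formulation. The function names are hypothetical and this is not taken from the paper's code; `d_real` and `d_fake` stand for the discriminator's estimated probability that a real or a generated image is real.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Low when real images score near 1 and generated images near 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake):
    """Low when the discriminator scores generated images as real."""
    return -math.log(d_fake)

# A discriminator that tells the two apart well has a lower loss
# than one that is guessing (0.5 for everything).
confident = discriminator_loss(d_real=0.9, d_fake=0.1)
guessing = discriminator_loss(d_real=0.5, d_fake=0.5)

# The generator's loss moves the other way: it drops as fakes get accepted.
fooling = generator_loss(d_fake=0.9)
caught = generator_loss(d_fake=0.1)
```

Training alternates between lowering one loss and then the other, which is what makes the two networks adversaries.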

In the paper they used an implementation with variational autoencoders. Variational autoencoders are used to produce an interface between fake image generation and humans inputting variables. To go one layer deeper: an adversarial network produces fake images from random noise, so each value of the random noise corresponds to a certain feature in the image. The problem is that those variables aren't exactly meaningful to us, and trying out every single one of them to map the meaningful features is impossible. So you use variational autoencoders to impose a Gaussian distribution assumption on each variable. The assumption is that the features we find meaningful have a normal distribution in the feature space.
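The Gaussian assumption can be sketched in a few lines. This is a generic illustration of how a variational encoder's output is usually turned into a latent sample and how far that output is from a standard normal, not the paper's implementation; the names `sample_latent` and `kl_to_standard_normal` are made up for this example.

```python
import math
import random

def sample_latent(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per latent variable."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, sigma^2) and N(0, 1), summed over variables."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))

rng = random.Random(0)
z = sample_latent(mu=[0.0, 1.0], log_var=[0.0, 0.0], rng=rng)

# An encoder output that already matches N(0, 1) pays no penalty;
# shifting a mean away from zero does.
no_penalty = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
shifted = kl_to_standard_normal([1.0, 0.0], [0.0, 0.0])
```

Adding the KL term to the training loss is what pushes each latent variable toward a normal distribution.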

The research group uses two decoders and two encoders, plus the assumption of a shared latent space, to map a summer (or day) image to a winter (or night) image. So you have an encoder for summer and an encoder for winter, and you place a restriction on them (the shared latent space assumption) that they should encode to the same space. Then the two decoders can take the encoded image from either of the encoders (which is the actual breakthrough) and produce a winter or a summer image. This shared latent space could of course be extended to any kind of weather, though it would add significantly to the computing cost.
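The data flow described above can be sketched with a toy example. Everything here is a stand-in: the real encoders and decoders are learned neural networks, while these are hand-written functions, and the `SNOW` offset is an invented way to make "winter" differ from "summer". Only the wiring (either encoder feeds either decoder through one shared code) reflects the explanation.

```python
SNOW = 0.3  # hypothetical brightness offset that separates the two toy domains

def encode_summer(image):
    """Toy encoder: summer pixels are used directly as the shared code."""
    return list(image)

def encode_winter(image):
    """Toy encoder: strip the snow offset to land in the same shared code."""
    return [pixel - SNOW for pixel in image]

def decode_summer(code):
    return list(code)

def decode_winter(code):
    return [value + SNOW for value in code]

def summer_to_winter(image):
    # Encode with one domain's encoder, decode with the other domain's decoder.
    return decode_winter(encode_summer(image))

def winter_to_summer(image):
    return decode_summer(encode_winter(image))

summer = [0.2, 0.5, 0.4]
winter = summer_to_winter(summer)
round_trip = winter_to_summer(winter)
```

Because both encoders land in the same space, supporting another kind of weather would mean adding one more encoder and decoder pair rather than a translator for every pair of domains.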

ofd · December 18, 2017

Guess what: the brain fakes input too

barn · December 23, 2017

On 12/18/2017 at 5:22 PM, Kikker said:

Well, to explain the ai thing properly. You have two [...]

Fascinating. If I understand you correctly (wouldn't surprise me if I didn't, being a layman...) you are describing how people program an AI's pattern recognition, the creation of more 'correct' abstractions in the long run.

ie. 'what's' a 'chair' and what could be coined as being a chair, roughly speaking.

Kikker · December 27, 2017

On 23-12-2017 at 12:54 PM, barn said:

Fascinating. If I understand you correctly (wouldn't surprise me if I didn't, being a layman...) you are describing how people program an AI's pattern recognition, the creation of more 'correct' abstractions in the long run.

ie. 'what's' a 'chair' and what could be coined as being a chair, roughly speaking.

Yes and no. What you're describing hasn't been the state of the art since 2011. You can with relative ease make a neural network recognize a chair. There have been two main breakthroughs since we were able to do that. Firstly, a chair is a single true-or-false question, so it's still manageable to build a traditional (fully connected) neural network and use it to recognize just chairs. If you would like to detect other objects in the images, you either have to make more networks or expand the one currently recognizing chairs to also recognize other objects. The latter increases training time dramatically. Secondly, you would need a data set of manually labeled pictures with chairs in them; not just a few hundred, but tens of thousands of manually labeled pictures to get good results.

These two problems were solved with the introduction of convolutional networks (inspired by the actual neurons humans use) and adversarial networks. Convolutional networks significantly reduce the computing cost and increase the effectiveness of the initial stage of the network. In this initial stage the image is compressed to a standard input and low-level features like edges are extracted. Another property is that the filters used by convolutional networks are local (each looks at a specific part of the image). In an adversarial network we actually make sure that some abstraction takes place, and since it's just two networks figuring out those abstractions, we don't need to label the data anymore. Also note that in a traditional neural network we don't care what abstractions the algorithm makes; we just need the correct output (we also don't know what abstractions are made, if any). Now, however, we can directly see the accuracy of those abstractions in the generated image. Without the convolutional layers the generated image would also be very blurry (higher resolution would be computationally infeasible), but with convolutional layers a crisp image can be generated.
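A rough way to see the reduction in computing cost is to count weights. The sketch below compares a fully connected first layer with a convolutional one; the image size, the 1000 output neurons and the 64 filters are arbitrary numbers chosen for illustration, and bias terms are left out.

```python
def fully_connected_weights(n_inputs, n_outputs):
    """Every output neuron is connected to every input value."""
    return n_inputs * n_outputs

def conv_weights(kernel_h, kernel_w, in_channels, n_filters):
    """Each filter only sees a small local patch and is reused across the image."""
    return kernel_h * kernel_w * in_channels * n_filters

pixels = 256 * 256 * 3  # hypothetical 256x256 RGB image

dense = fully_connected_weights(pixels, 1000)  # 196608000 weights
conv = conv_weights(3, 3, 3, 64)               # 1728 weights
```

The two layers do not compute the same thing, so this is not a like-for-like comparison, but it shows why local, reused filters make higher resolutions affordable.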

In the video above every object has some kind of abstraction and some kind of transformation to the other weather conditions. So though you don't strictly need this kind of abstraction when simply recognizing a chair, it's definitely needed in the case above.

---

Looking back, it's for me a given that these kinds of networks need to make some kind of abstraction at some level in order to function. But maybe it isn't for you, I hope you can get some intuition from my explanation.

barn · December 27, 2017

Very nice @Kikker

I am not able to say more now, than just that. Later I'll expand, promise.

Barnsley

barn · December 29, 2017

Hi @Kikker

I appreciate the time and effort you put into answering (to me what it seems like) in a way that's rather on the digestible side. Thanks for that.

On 12/28/2017 at 12:00 AM, Kikker said:

On 12/23/2017 at 12:54 PM, barn said:

ie. 'what's' a 'chair' and what could be coined as being a chair, roughly speaking.

Yes and no. What you're describing hasn't been the state of the art since 2011. You can with relative ease make a neural network recognize a chair. There have been two main breakthroughs since we were able to do that.

That's a gentle way to put it. I appreciate.

On 12/28/2017 at 12:00 AM, Kikker said:

Secondly, you would need a data set of manually labeled pictures with chairs in them; not just a few hundred, but tens of thousands of manually labeled pictures to get good results.

Isn't that what everyone does for free for scroogle and the like? ('click the squares that contain an image of a car')

I know for a fact, that many aspects of webdesign are already implemented into the way interaction is, ie. in A/B testing to name one.

On 12/28/2017 at 12:00 AM, Kikker said:

(inspired by the actual neurons humans use)

Isn't that a bit of a far stretch... I mean inspired - light, no caffeine, no sugar, no nothing?

How many connections does an average neuron have, thousands???

(well, I suppose evolution has a couple billions of years headstart to computer science)

On 12/28/2017 at 12:00 AM, Kikker said:

In an adversarial network we actually make sure that some abstraction takes place, and since it's just two networks figuring out those abstractions, we don't need to label the data anymore.

Huh. Reducing cost.

On 12/28/2017 at 12:00 AM, Kikker said:

Now, however, we can directly see the accuracy of those abstractions in the generated image. Without the convolutional layers the generated image would also be very blurry (higher resolution would be computationally infeasible), but with convolutional layers a crisp image can be generated.

This is going to be a cringe inducing question but if you don't mind, I'll risk it...

Isn't the purpose of learning to encounter unknown parameters, usually not within the previously projected boundaries?

On 12/28/2017 at 12:00 AM, Kikker said:

In the video above every object has some kind of abstraction and some kind of transformation to the other weather conditions. So though you don't strictly need this kind of abstraction when simply recognizing a chair, it's definitely needed in the case above.

To any keen-eyed (or 20/20 vision) observer, it'll be apparent that undergrowth and vegetation appearing/disappearing on lamp/electrical posts shouldn't be a part of reality. It's clear that the video is an approximation only, catering to expectations. It serves the purpose but naturally doesn't stand up to scrutiny, almost just a foreshadowing of a 'could be virtual reality 2.0' in my humble opinion. Nevertheless, creepy to me.

On 12/28/2017 at 12:00 AM, Kikker said:

Looking back, it's for me a given that these kinds of networks need to make some kind of abstraction at some level in order to function.

I suppose that's the end goal that so far no one has figured out how to manifest physically.

On 12/28/2017 at 12:00 AM, Kikker said:

But maybe it isn't for you, I hope you can get some intuition from my explanation.

Oh, no man. Absolutely, fascinating stuff. Thanks again for making an effort to translate it into 'plain English'. I'm sure many agree with me at least on that one.

Barnsley

Kikker · January 9, 2018

On 29-12-2017 at 2:22 AM, barn said:

That's a gentle way to put it. I appreciate.

It takes a lot of time to keep up to date with progress in artificial intelligence if it's not your field. The only thing I'm wary of (and annoyed by) is people taking offense. For example the debacle that played out here.

On 29-12-2017 at 2:22 AM, barn said:

Isn't that what everyone does for free for scroogle and the like? ('click the squares that contain an image of a car')

I know for a fact, that many aspects of webdesign are already implemented into the way interaction is, ie. in A/B testing to name one.

Well, that requires significant pre-processing, i.e. an algorithm that cuts the relevant pieces out of a larger image (how can you determine that something is relevant?), and then using humans for the final training, which is probably based on consensus. So you would need a network trained to find data to be labeled, which would be a far simpler network nonetheless.

On 29-12-2017 at 2:22 AM, barn said:

Isn't that a bit of a far stretch... I mean inspired - light, no caffeine, no sugar, no nothing?

How many connections does an average neuron have, thousands???

(well, I suppose evolution has a couple billions of years headstart to computer science)

Ah, I didn't mention that. Consider a neural net of 4 layers, where each layer has 4000 neurons. Then each neuron has at least 4000 connections, and the neurons in the middle layers (2 and 3) have 8000. So yes, a neural net can have neurons with thousands of connections. But more importantly, neurons involved in the early stages of human vision have significantly fewer connections and a certain connection structure; that structure was translated to neurons in a neural net. That was the inspired part.
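The arithmetic above can be checked with a short sketch, assuming every layer is fully connected to its neighbours and counting both incoming and outgoing connections per neuron. The helper name is made up for this example.

```python
def connections_per_neuron(layer_sizes, index):
    """Incoming plus outgoing connections for one neuron in the given layer."""
    incoming = layer_sizes[index - 1] if index > 0 else 0
    outgoing = layer_sizes[index + 1] if index < len(layer_sizes) - 1 else 0
    return incoming + outgoing

layers = [4000, 4000, 4000, 4000]

per_layer = [connections_per_neuron(layers, i) for i in range(len(layers))]
# -> [4000, 8000, 8000, 4000]

# Total number of weights between consecutive layers.
total_weights = sum(a * b for a, b in zip(layers, layers[1:]))
# -> 48000000
```

The first and last layers only connect in one direction, which is where the "at least 4000" comes from.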

On 29-12-2017 at 2:22 AM, barn said:

This is going to be a cringe inducing question but if you don't mind, I'll risk it...

Isn't the purpose of learning to encounter unknown parameters, usually not within the previously projected boundaries?

I don't understand how my statement suggested something about unknown parameters. I simply mentioned that more pixels (better resolution) costs more computing power.

On 29-12-2017 at 2:22 AM, barn said:

I suppose that's the end goal that so far no one has figured out how to manifest physically.

Mmm... So I consider abstractions at a much lower level than you refer to here. An abstraction from a sharp contrast in an image to an edge (of something) is for me already an abstraction; pile a bunch of those mini-abstractions together (edges to corners, for example) and you can get something useful (like a square).
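The step from "sharp contrast" to "edge" can be shown with a tiny hand-written filter. In a real convolutional network the filter values are learned rather than chosen by hand, so this is only an illustration; the image and the filter below are invented for the example, and the operation slides the filter without flipping it, as convolutional layers typically do.

```python
def convolve2d(image, kernel):
    """Slide a small filter over the image and sum the weighted pixels at each position."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    output = []
    for row in range(len(image) - kernel_h + 1):
        output_row = []
        for col in range(len(image[0]) - kernel_w + 1):
            output_row.append(sum(image[row + i][col + j] * kernel[i][j]
                                  for i in range(kernel_h)
                                  for j in range(kernel_w)))
        output.append(output_row)
    return output

# A 4x4 image that is dark on the left half and bright on the right half.
image = [[0, 0, 1, 1] for _ in range(4)]

# Responds where brightness increases from left to right.
vertical_edge = [[-1, 1]]

response = convolve2d(image, vertical_edge)
# Each row is [0, 1, 0]: the filter fires only where dark meets bright.
```

Stacking further filters on top of maps like `response` is how edges could be combined into corners and then into shapes.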

barn · January 10, 2018

15 hours ago, Kikker said:

On 12/29/2017 at 2:22 AM, barn said:

That's a gentle way to put it. I appreciate.

It takes a lot of time to keep up to date with progress in artificial intelligence if it's not your field. The only thing I'm wary of (and annoyed by) is people taking offense.

I went and checked it out. Well, probably we can agree on that people who aren't for you learning/understanding as much as possible, aren't interested in your wellbeing, aren't self-policing, if you are honest and straightforward they aren't empathetic to it... Ergo, their argumentation isn't preferable even if it holds the best truths available in that moment. Getting emotional and mudslinging is a sign of 'he/she lost it', emotional bias.

Though it's perfectly understandable and because those who recognise such personal (temporary) flaws, we're able to make amends and re-engage in a healthier approach anew. Otherwise, floccinaucinihilipilification, my dude(respectfully).

15 hours ago, Kikker said:

Well that requires significant pre-processing. aka an algorithm that cuts the relevant[...]

Yes, I can see why I was wrong. Thanks.

15 hours ago, Kikker said:

Consider a neural net of 4 layers, where each layer has 4000 neurons. Then each neuron has at least 4000 connections, and the neurons in the middle layers (2 and 3) have 8000. So yes, a neural net can have neurons with thousands of connections. But more importantly, neurons involved in the early stages of human vision have significantly fewer connections and a certain connection structure; that structure was translated to neurons in a neural net. That was the inspired part.

Gotcha. Sure, that clarified it. I see what you had meant now.

15 hours ago, Kikker said:

I don't understand how my statement suggested something about unknown parameters. I simply mentioned that more pixels (better resolution) costs more computing power.

of my ->

On 12/29/2017 at 2:22 AM, barn said:

Isn't the purpose of learning to encounter unknown parameters, usually not within the previously projected boundaries?

because you had written ->

On 12/28/2017 at 12:00 AM, Kikker said:

Now, however, we can directly see the accuracy of those abstractions in the generated image. Without the convolutional layers the generated image would also be very blurry (higher resolution would be computationally infeasible), but with convolutional layers a crisp image can be generated.

and I thought (highlighted) are from an objective standpoint 'dreamed up', of course not entirely but principally yes.

15 hours ago, Kikker said:

Mmm... So I consider abstractions at a much lower level than you refer to here. An abstraction from a sharp contrast in an image to an edge (of something) is for me already an abstraction; pile a bunch of those mini-abstractions together (edges to corners, for example) and you can get something useful (like a square).

But wouldn't that be according to your preconceived set of definitions?

Other systems might interpret things differently. The movies 'Arrival' and 'Interstellar' come to mind. (inconceivable sets of 'language' prior to realisation of differing inertia systems)

p.s.{You're making my mind work on full steam, I'm 'having a ball'. Appreciate it.}

Kikker · February 4, 2018

On 10-1-2018 at 12:06 PM, barn said:

Yes, I can see why I was wrong. Thanks.

It's cheaper, just not as cheap as it sounds.

On 10-1-2018 at 12:06 PM, barn said:

of my ->

because you had written ->

and I thought (highlighted) are from an objective standpoint 'dreamed up', of course not entirely but principally yes.

But wouldn't that be according to your preconceived set of definitions?

Other systems might interpret things differently. The movies 'Arrival' and 'Interstellar' come to mind. (inconceivable sets of 'language' prior to realisation of differing inertia systems)

I tried to answer this question some time ago, but then went to watch those movies to see what you're on about. The thing is, I didn't return to this forum afterwards.

So the first thing to keep in mind is that neural nets are heavily inspired by the brain, so they're already designed to emulate a "known" inertia system, the one we operate on. Though several complexities of human brains aren't simulated, and neural nets don't have nearly as many neurons as a human brain (even the big ones like in the video).

Given that, those mini-abstractions I talked about are simply a way to describe what happens in a neural net. With images you can, to a certain extent, trace back what every layer receives and get a nice little picture of it, like below. But what actually happens isn't restricted by the way people try to conceptualize the inner workings of a net. Simply put, the abstractions I mentioned are meant for you (and me), to get some idea. An actual neural network can manipulate data in all kinds of ways which don't conform to the way I described there.

Also, when I talked about the accuracy of those abstractions I actually just meant their usefulness in doing what the neural net is supposed to do. I assume that some kind of abstractions take place and that their usefulness determines the outcome.

Edges to features to faces.

barn · February 4, 2018

That's all right @Kikker, thanks for chiming back.
