SPEAKER 1: Remco was previously working at Boeing.
And he got his degree from Brown University.
And he will tell us about how to simplify urban models with
3D buildings while still preserving legibility.
Thanks, Emil.
REMCO CHANG: Thank you guys for coming here.
It's a great privilege to be at Google.
So I'm here today to talk about kind of a broader
research question that I've been working on.
And the idea is to try to understand urban
environments through the use of a thing called urban
legibility.
And I'll talk about that a little bit.
So this talk is actually going to be broken down
into two parts.
The first half is going to be on simplification of the urban
models, while maintaining urban legibility.
And this is a talk that I gave at SIGGRAPH this summer.
So I apologize to people that were there.
It will be a repeat, almost verbatim repeat.
And the second half will be on discussion and future work,
where I'll be talking about some of the things that we've
been working on, as well as some of the things that we
would like to be working on in the future.
And before I start, I'd just like to talk a little bit
about what we're doing at the
visualization center at Charlotte.
And specifically, one thing that we've been really
interested in is in this idea of knowledge visualization.
And to give you an example of what we mean by knowledge
visualization, you can imagine like you have a screen of
labels, there's lots of text, and they're kind of
overlapping each other.
Now, if somehow you can take this screen and extract some
sort of higher knowledge, either from the scene or from
the user, then it is theoretically possible that
you can focus your resources on the things that you want
the user to focus on.
So in other words, we think of it as, you want to minimize
the resources that you're using, but you want to
maximize the information that you're giving to the user.
And the resource can really be anything.
It can be CPU time, it could be the number of polygons.
And in this particular case, it really is just the number of
pixels that you're using on the screen.
To give you an idea of what we consider a great knowledge
visualization paper, here's something done by Agrawala and
Stolte, who are both at Stanford.
They're with Pat Hanrahan's group.
And this paper is on rendering effective route maps.
And here's an example of directions that would be given
by MapQuest. And you see that this direction
is physically accurate.
It shows you exactly how to get from point a to point b.
But that's about all it is.
You don't really get a lot of information out of this.
Whereas, typically speaking, if you ask somebody to give
you directions, this is something that
people would draw for you.
So this hand-sketched thing is not at
all physically accurate.
I mean, it's compressing highway 101 into
a very small amount.
Instead, it puts more emphasis on how you get onto highway 101, or
110, and how to get off.
So of course, in their paper that was published at
SIGGRAPH, 2001, they were able to mimic what
humans typically do.
And in this case, really showing off the important
information in this task of giving people
directions and maps.
So we want to take this idea of knowledge visualization and
apply it to urban models.
So here's a model of a city in China.
And the question is, what is the knowledge in this scene?
What is it that we want to be able to
preserve and to highlight?
To answer that, we turn to this idea of urban legibility.
And urban legibility is a term that was made famous by Kevin
Lynch in his 1960 book called The Image of the City.
So what he did for this book was that he went around the
city of Boston and asked local residents
to sketch out--
just kind of use a pen and sketch out their immediate
surroundings.
So what he actually got was a stack of these images that you
see on the right here, where people just simply sketched
out, you know, this is where I live, this is a big road
around me, and so on.
And he took this stack of sketched images, and he
categorized the important things into five groups.
He categorized them into paths, which are highways, railroads,
roads, canals.
Edges, which are shorelines or boundaries.
Districts, like industrial or residential districts.
Nodes, which you can think of as areas where lots of
activities converge.
So as an example, Times Square in New York City.
And then landmarks.
And landmarks can really be anything.
It can be a tree, it can be a sign post, it
can be a big building.
It's whatever people use to navigate themselves in an
urban environment.
So Kevin Lynch defined this idea of urban legibility as
"the ease with which a city's parts may be recognized and
can be organized into a coherent pattern." So that's
kind of a mouthful.
But to me, what that really says is, if you can somehow
deconstruct a city into these urban legibility elements, you
can still organize the city into a coherent pattern.
The use of urban legibility in computer science really goes
back a little ways.
Ruth Dalton, in her 2002 paper, just chronicles the
history of what people have done in computer science in
the use of urban legibility.
And it kind of breaks down into two groups.
There's one group that tries to justify whether or not the
idea of urban legibility actually makes sense.
So what they did was, they tried to figure out if these
elements are actually important to human navigation.
And what they found out, interestingly enough, is that
paths, edges, and districts are very important to
navigation, but landmarks are kind of questionable.
There are some groups that think it's very useful, and
some groups that say it's totally useless.
And the one element that's missing here is
the element of nodes.
And people have not really been able to successfully
quantify what a node really is.
So there hasn't been as much research done on trying to
figure out if nodes are helpful at all.
And the other group of researchers just use urban
legibility, and in particular, in graphics and visualization.
Most notably, Ingram and Benford have a whole series of
papers where they try to use urban legibility in navigating
abstract data spaces.
So the question is, why did we decide to use urban
legibility?
And to give you an idea, here we take an original model.
These are a bunch of buildings in our Atlanta data set,
looked at from a top down view.
This is what you would get if you use a traditional
simplification method, such as QSlim.
And I'm assuming people know what QSlim is.
But what you see is that a lot of the buildings get decimated
to the point where they don't really look
like buildings anymore.
Whereas, our approach is a little bit different.
We take an aggregated approach, and this is
what you will get.
And if we apply a texture map onto our model, this is
what you end up with at the end.
So, it's actually really interesting that when we take
these four models, put them in a fly-through scene, just
kind of a test scenario, and measure how many pixels are
different from the original model, this is
the graph that we get.
So you don't have to look at it carefully.
But the important thing here that I'm picking out is that,
basically, using all these models, they end up with very,
very similar differences in terms of pixel errors.
And what that says to us is that, even though you look at
these four models and say, well, they look very
different to me, in effect, if you measure them purely
quantitatively using pixel errors, they
actually come out to be very similar.
So what that really says to us is, we can't really just use
pixel errors as the driving force behind simplification of
urban models.
We have to use something a little bit different.
We have to use higher-level information here.
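To make that pixel-error measurement concrete, here is a minimal sketch, not from the paper, of the kind of comparison being described: render the same fly-through frame once with the original model and once with a simplified model, then count the fraction of pixels that differ.

```python
# Minimal sketch of a per-pixel error measure between two rendered frames.
# Frames are assumed to be H x W x 3 uint8 arrays captured from the same
# camera pose; this illustrates the idea, not the paper's exact metric.
import numpy as np

def per_pixel_error(original_frame: np.ndarray, simplified_frame: np.ndarray) -> float:
    """Return the fraction of pixels whose color differs between the two frames."""
    differs = np.any(original_frame != simplified_frame, axis=-1)
    return float(differs.mean())
```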
And to simplify this, let me just state, our goal for this
project is to create simplified urban models that
retain the image of the city from any
view angles and distances.
And as an example of what we get, you see the original
model on the left.
The middle image shows the model having been reduced to
45% of the polygons.
And the last one is 18%.
And you kind of see a little bit of a dimming effect
across, when it goes from the original to fewer polygons.
But the important thing to notice here is that, when you're
doing this, the important features in
the city are retained.
So for example, the road here is still kept.
The city square area is kept.
And you pretty much still get the sense that this is the
same city that you're looking at, even though there's only
18% of the polygons in the scene.
I'm just going to run the application really quickly,
and hopefully, nothing goes wrong.
OK.
So this is using the Chinese city data set.
And this is running live.
So as you can see, I can just kind of look around.
Move to different places.
And here--
hold on one second.
This is where the demo goes wrong.
OK.
So I'm just going to start zooming out from this view.
AUDIENCE: Can you mention how you got that geometry in the
[UNINTELLIGIBLE]?
Is that made out [UNINTELLIGIBLE PHRASE].
REMCO CHANG: Those textures are totally fake.
AUDIENCE: [UNINTELLIGIBLE PHRASE].
REMCO CHANG: The geometry is actually real.
So what we got was the original footprint
information, and approximate height information
in terms of the number of stories, or number
of floors, per building.
And we estimated that each story is about three meters.
So the geometry is kind of an extrusion of the footprints.
So it's not real in terms of the true 3D models, but the
footprints and the positions are
actually absolutely correct.
AUDIENCE: Do you [UNINTELLIGIBLE] the fact that
[UNINTELLIGIBLE] you get repeated texture patterns?
REMCO CHANG: There is definitely some.
But I'll talk about that in a little bit.
Yes, sir?
AUDIENCE: What kind of specification
[UNINTELLIGIBLE PHRASE]?
REMCO CHANG: As it turns out--
I'll get into that a little bit later, too--
this is--
AUDIENCE: [UNINTELLIGIBLE PHRASE]
REMCO CHANG: Oh.
OK, sorry.
So the question was, what kind of hardware I'm
running this on.
And the honest truth is, I have no idea.
But what I do know is that this is kind of the state of
the art laptop from Dell.
But as it turns out--
I'll explain this a little bit-- but the bottleneck's
actually not in the graphics card.
It's actually in my crappy code where I'm not
transferring data fast enough.
It's the pipeline that's actually the
bottleneck right now.
But that's just my fault.
I wrote some crappy code.
So here I'm just zooming out from
that particular viewpoint.
And to demonstrate my point, we're just going to keep
zooming out and keep zooming out.
Keep zooming out.
And at this point, I'm going to freeze the level of detail
simulation.
I'm taking away the ground plane, I'm
taking away the textures.
And when I zoom back in, you see that this is actually what
was rendered when you're that far away.
And you can see the number of polygons at the
bottom, right here.
In terms of how many polygons were actually seen when you're
really that far away.
So the important thing here to note is that, even though we
are doing some pretty drastic simplification, we still try
to retain the important features in the city.
And just to give another quick example of this, I'm just
going to run this without the texture.
We also take height preservation
into consideration.
So what that means is--
aah.
So I can be looking at a city from kind of an oblique view.
And again, if I freeze this process, and I zoom out, you
see that there's a lot more detail around where the
viewpoint is.
And as you go out further, the models are all greatly
simplified.
But the interesting thing to notice here is that, even for
objects that are far away, the tall buildings are still rendered
separately.
So you kind of get that skyline, even when you're looking
at it from different view angles, even
close to the ground plane.
OK.
Just to give you an idea about what work has been done in
terms of urban fly-throughs or urban walk-throughs--
people have tried different things.
You know, visibility and occlusion is a popular one.
And Peter Wonka has a great paper called Instant
Visibility in 2000, and Schaufler has another great
paper in 2000 as well, in SIGGRAPH.
And in here, the idea is that if you use occlusion or
visibility, you can reduce the number polygons that you're
rendering, so that you can actually see a bigger scene,
without actually seeing the bigger scene.
And there are people who use impostors, or
billboards with textures.
This started with Maciel and Shirley in '95.
And then Sillion in '97 extended it to kind of blend
between impostors, as well as real geometries.
And in '98, Shalabi, in his PhD thesis, extended Sillion's
work, but added in some elements of legibility in
there as well.
Then there are procedurally generated buildings.
I think this was really started by Peter Wonka's
2003 paper called Instant Architecture.
In this year's SIGGRAPH, Pascal Mueller has a great
paper on procedurally generated buildings.
And then lastly, we just have popping, like Google Earth and
Microsoft Live.
So there are about five steps to our algorithm in order to
preserve legibility.
And in order, the first thing we do is, we try to identify
and preserve the paths and the edges.
And we do that through this hierarchical clustering that
I'll be talking about.
The second step is in creating logical districts and nodes,
and we do that through cluster merging.
Then the third step is simplifying the model, while
trying to preserve paths, edges, districts, and nodes.
And that's done through our simplification process.
Then we hierarchically apply the appropriate amount of
texture, and that's through the texturing process.
And these four steps combine to become the
pre-processing step.
And at the end of this pre-processing step, you end
up with a hierarchy of meshes, as well as
a hierarchy of textures.
Then we feed all this stuff into the run time process,
where we try to highlight the landmarks, and choose the
appropriate model to render, based on the viewpoint.
And that's through the LOD with landmark
preservation process.
So first I'll talk about how we do preservation of paths
and edges through hierarchical clustering.
Here you see the result of two clustering methods.
The one on the left is more of a traditional thing, like
k-means and whatnot.
And the right is what we use.
And you can see the--
this is a cool animation--
you can see that where the two clusters meet in the first
example, it doesn't really follow any
sort of logical path.
Whereas, in our implementation, we really try
to make sure that the two clusters are separated along a
logical road.
And the way we do that is by using something called
single-link clustering.
And that's a pretty simple idea.
It basically, iteratively, groups the two closest
clusters together, based on, in our
situation, Euclidean distance.
So as an example, let's say that you have six buildings to
start with, a, b, c, d, e, and f.
And you just start grouping them two at a time.
And eventually, what you get back is a binary tree, or
something called a dendrogram.
The thing that's interesting here to note is that, in this
particular scenario, the dendrogram is actually very
unbalanced.
On one side, you have node a.
On the other side, you have b, c, d, e, and f.
And this doesn't work well at all for our purposes, so we
had to do a little bit of balancing of the tree by
penalizing larger clusters.
And the idea here is, we want to create a
more balanced tree.
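As a rough sketch of this step, a balanced single-link pass over building centroids might look like the following; the function name and the exact form of the size penalty are my own guesses, not the paper's.

```python
# Hedged sketch of balanced single-link clustering over building centroids:
# repeatedly merge the two clusters whose nearest members are closest in
# Euclidean distance, inflating that distance by a size penalty so the
# resulting dendrogram stays roughly balanced.
import math

def single_link_dendrogram(points, size_penalty=0.1):
    """points: list of (x, y) building centroids; returns a nested-tuple tree."""
    # each cluster is (member_points, subtree); leaves carry their index
    clusters = [([p], i) for i, p in enumerate(points)]

    def merge_cost(ca, cb):
        d = min(math.dist(p, q) for p in ca[0] for q in cb[0])  # single link
        return d * (1.0 + size_penalty * (len(ca[0]) + len(cb[0])))

    while len(clusters) > 1:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))
        b, a = clusters.pop(j), clusters.pop(i)   # pop the higher index first
        clusters.append((a[0] + b[0], (a[1], b[1])))
    return clusters[0][1]

# Example: six building centroids a..f, as in the talk's dendrogram figure.
tree = single_link_dendrogram([(0, 0), (1, 0), (5, 0), (6, 0), (5, 3), (9, 9)])
```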
Here are some simple images of the first few steps of this
hierarchical clustering process.
And you can see, red is one cluster, green's the other.
And we just follow one particular
path down this process.
You can see that, throughout the simplification process,
the divide between the two clusters mostly follows the
road pretty well.
So once we have the clustering, the next thing to
do is to merge the clusters together to create logical
districts and nodes.
Here's an example of what we mean by
creating a logical district.
You have one cluster of buildings in red, the other
cluster of buildings in yellow.
And the blue outline shows what the merger will be.
And this is what it should look like, where the blue
outline should just encompass all of them together,
including the black space in the middle.
So the way we do that is pretty simple.
We first find the footprints of each building, ordered in a
counterclockwise manner.
Then we find the two shortest edges that will connect these
two footprints.
Then you just start tracing it, starting
with one of the vertices.
Now, when we start with the magenta vertex, what we end up
with is actually what we're looking for, which is the
resulting merged hull.
Now it's interesting to note that, when we start with a
different vertex, and again go in the counterclockwise fashion,
what we get back is what we call the negative space.
Or you can think of it as the error that's introduced by
merging these two hulls together.
And this is actually very important, because the
negative space is what we end up using in determining what
model to render later down the process.
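Here is a hedged sketch of that merge, assuming the shapely library is available; instead of the talk's counterclockwise vertex trace, it approximates the merged hull as the union of the two footprints plus a bridge spanning their two closest vertex pairs, and measures the negative space as the area that the merge adds.

```python
# Hedged sketch of merging two footprints and measuring the negative space.
# The "bridge" between the two closest vertex pairs stands in for the two
# shortest connecting edges described in the talk.
import itertools
from shapely.geometry import Polygon, MultiPoint
from shapely.ops import unary_union

def merge_footprints(a: Polygon, b: Polygon) -> Polygon:
    av = list(a.exterior.coords)[:-1]
    bv = list(b.exterior.coords)[:-1]
    pairs = sorted(itertools.product(av, bv),
                   key=lambda p: (p[0][0] - p[1][0]) ** 2 + (p[0][1] - p[1][1]) ** 2)
    a1, b1 = pairs[0]                                  # shortest connecting edge
    a2, b2 = next(p for p in pairs[1:] if p[0] != a1 and p[1] != b1)
    bridge = MultiPoint([a1, b1, a2, b2]).convex_hull  # quad between the two edges
    return unary_union([a, b, bridge])

def negative_space_area(a: Polygon, b: Polygon) -> float:
    # Area introduced by the merge; used at run time to decide whether the
    # merged model is still acceptable for the current viewpoint.
    return merge_footprints(a, b).area - unary_union([a, b]).area

# Example: two axis-aligned footprints with a 1-unit gap between them.
print(negative_space_area(Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
                          Polygon([(3, 0), (5, 0), (5, 2), (3, 2)])))
```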
So once we have that, then the next step is to simplify it.
And here's an example of what I mean by simplification.
You see on the image on the left, it's the merged hull of
the Atlanta data set.
And here you have about 6,000 edges.
And the idea here is, we want to simplify it without really
killing too many of the features of the city.
And in this case, we reduce it to about 1,000 edges or so.
And I'm just going to show you real quick how this works.
OK.
So this is just a small data set that I'm showing.
And right now, you see the blue outline, that's the
merged hull of all of these buildings together.
And as I start sliding the slider down, I'm actually
doing simplification as I speak.
So you can start to see that little features are starting
to be filled in.
And I just keep going.
So what's interesting about this algorithm that we
developed is that, eventually, if you just keep going at it,
you get back to the convex hull.
So it's not at all the most efficient way of finding the
convex hull, but you do find the convex hull.
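A hedged sketch of that simplification follows; the criterion used here, filling in the smallest concave notch first, is my own interpretation of what was described, not the published algorithm.

```python
# Hedged sketch of simplifying a merged-hull outline: repeatedly drop the
# reflex (concave) vertex whose removal fills in the least area, so small
# notches disappear first; taken far enough, the outline becomes convex,
# matching the remark that you eventually get back the convex hull.
def cross(o, p, q):
    return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])

def simplify_outline(verts, target_edges):
    """verts: counterclockwise list of (x, y); stop at target_edges edges."""
    pts = list(verts)
    while len(pts) > max(3, target_edges):
        best_i, best_cost = None, None
        for i in range(len(pts)):
            prev, cur, nxt = pts[i - 1], pts[i], pts[(i + 1) % len(pts)]
            turn = cross(prev, cur, nxt)
            if turn < 0:                      # reflex vertex: an inward notch
                cost = -turn / 2.0            # triangle area filled by removing it
                if best_cost is None or cost < best_cost:
                    best_i, best_cost = i, cost
        if best_i is None:                    # outline is already convex
            break
        del pts[best_i]
    return pts
```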
Right.
So once the polylines have been simplified, or once the
merged hull has been simplified, we create what we
call cluster meshes.
These are nothing more than extrusions of the footprints,
where we determine the height of the cluster mesh to
just be the median height of all the
buildings in the cluster.
And this is what we mean by a cluster mesh.
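A minimal sketch of that extrusion, with a vertex and face layout chosen purely for illustration:

```python
# Hedged sketch of building a cluster mesh: extrude the simplified cluster
# footprint into a prism whose height is the median height of the buildings
# in the cluster.
import statistics

def cluster_mesh(outline, building_heights):
    """outline: CCW list of (x, y); returns (vertices, side faces as quads)."""
    h = statistics.median(building_heights)
    n = len(outline)
    verts = [(x, y, 0.0) for x, y in outline] + [(x, y, h) for x, y in outline]
    # one quad per footprint edge: bottom i, bottom i+1, top i+1, top i
    sides = [(i, (i + 1) % n, (i + 1) % n + n, i + n) for i in range(n)]
    return verts, sides
```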
Once we have that, then we apply texture onto all these
cluster models.
And we do it in a hierarchical fashion.
So first of all, we give each cluster mesh
six different textures.
We give it a side texture.
We give it a top down view of the roof texture.
And we kind of do an impostor-ish kind of thing,
where we give four roof textures from
four different angles.
And then we put these cluster meshes into bins, based on how
visually important they are.
And the idea here is that, if you have a particular cluster
mesh that you know won't show up until it's like really far
away, then what you can do is, you don't have to give it as
much texture real estate.
So, in such a case, what you would do is put it in the
earlier bins.
And I'll explain that in a little bit.
Each of these bins that you see, n/2 and n/8, each bin
will contain the same amount of texture resolution.
So what that means is, if you have a particular cluster mesh
that's put in the first bin, meaning the n/2 bin, the
amount of texture resolution that it will receive will be a
lot smaller, because there will be more people
competing for it.
Whereas, further down this pipeline, the more visually
important clusters will actually get a lot more
texture resolution.
So this is a way for us to actually control the texture a
little bit, because texture in a lot of these urban models,
as you mentioned earlier, is a pretty big problem.
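A sketch of that budgeting scheme follows; the bin sizes and the per-bin texel budget are placeholder numbers, not values from the paper.

```python
# Hedged sketch of hierarchical texture budgeting: sort cluster meshes by
# visual importance, fill bins of shrinking size (n/2, n/4, n/8, ...), and
# give every bin the same total texture budget, so clusters in the small,
# visually important bins each receive far more resolution.
def texture_budgets(cluster_ids, importance, bin_budget_texels=1024 * 1024):
    """importance: dict cluster_id -> score (higher = more visually important)."""
    ordered = sorted(cluster_ids, key=lambda c: importance[c])  # least important first
    budgets, start, size = {}, 0, max(1, len(ordered) // 2)
    while start < len(ordered):
        members = ordered[start:start + size]
        per_cluster = bin_budget_texels // len(members)  # split the bin's budget evenly
        for c in members:
            budgets[c] = per_cluster
        start += size
        size = max(1, size // 2)              # the next bin holds half as many clusters
    return budgets
```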
So once we're done with all this, we put the resulting
cluster meshes, and the hierarchy of meshes, and the
hierarchy of textures, into our run time system, where we
try to preserve landmarks, and choose the
appropriate models to render.
So first, we talk about how we choose the
right model to render.
And we start with the root node of our dendrogram.
In this case, a, b, c, d, e, and f.
And then we take the negative space of that particular
cluster mesh, and that's shown here as an approximated 3D
box, in red.
And we project that box onto screen space.
And if the number of pixels that it occupies exceeds some
user-defined tolerance, then what we do is we reject that
node, and we recursively check its two children.
And it just keeps going until you find the appropriate cut
in the LOD tree.
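A simplified sketch of that traversal; the node layout and the crude pixel estimate stand in for the paper's actual screen-space projection of the negative-space box.

```python
# Hedged sketch of choosing the LOD cut: starting at the dendrogram root,
# estimate how many pixels a node's negative-space box would cover on screen
# and recurse into the children whenever that exceeds the user tolerance.
from dataclasses import dataclass, field
from typing import List

@dataclass
class LODNode:
    negative_space_diag: float      # world-space diagonal of the negative-space box
    distance_to_eye: float          # distance from the camera, updated per frame
    children: List["LODNode"] = field(default_factory=list)

def projected_pixels(node, screen_height_px=1080):
    # crude pinhole estimate: apparent size shrinks linearly with distance
    return screen_height_px * node.negative_space_diag / max(node.distance_to_eye, 1e-6)

def select_lod_cut(node, pixel_tolerance, out):
    if node.children and projected_pixels(node) > pixel_tolerance:
        for child in node.children:
            select_lod_cut(child, pixel_tolerance, out)   # error too visible: refine
    else:
        out.append(node)                                  # render this cluster mesh
```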
The next thing I'll talk about is the landmark preservation process.
Here you see the original skyline on the left.
The middle image shows without the landmark preservation.
And the last image shows our method with the landmark
preservation.
And all you really need to do is add a few buildings
that are visually important, and you really give back the sense
that the last image is very similar to the original.
And the way that's done is, basically, by projecting a
user-defined tolerance, alpha, onto each cluster mesh.
And if there's any building in that cluster that's taller
than alpha height, then it will be drawn on top of the
cluster mesh.
In other words, here's a scene of cluster meshes.
And you see here, on the lower right hand corner, is a little
green line.
And that's the projected alpha height.
And these buildings are taller than the alpha
height, so they are drawn separately.
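A minimal sketch of that selection; the data layout is mine, and alpha here is taken to be the already-projected, world-space height threshold for the cluster.

```python
# Hedged sketch of landmark preservation: within a cluster chosen by the LOD
# cut, any building taller than the projected threshold alpha is drawn
# individually on top of the cluster mesh, so tall landmarks and the skyline
# survive even aggressive simplification.
def landmark_buildings(buildings, alpha):
    """buildings: list of (building_id, height); returns ids drawn separately."""
    return [bid for bid, height in buildings if height > alpha]

# Example: with alpha = 30 m, only the two towers are drawn on top of the
# cluster mesh; everything else stays merged.
print(landmark_buildings([("tower_a", 80.0), ("shop", 6.0),
                          ("tower_b", 45.0), ("house", 9.0)], alpha=30.0))
```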
And I'll talk a little bit about the results of what we get
out of this system.
Here you see this little line, blue line, that runs across
the bottom of the scene.
And that's rendering using the unsimplified meshes.
And it's just a constant frame rate.
And we see that, in this particular case, we have a
fly-through scene, where the camera starts out really far
away, zooms in to the model, and flies back out again.
So what you end up seeing is, when the camera is far away on
the top, you have really, really great frame rate.
But once you start to really, really zoom in, to the point
where you're almost touching the ground, then the problem
is the overhead of traversing your LOD tree really catches
up with you.
In which case, the frame rate is actually worse than if you
just rendered the entire scene independently.
And correspondingly, in terms of the number of polygons
that's rendered, you see the line on the top.
And that's the number of polygons in the static, or the
unsimplified, model.
And you see that the two graphs are kind of the inverse
of each other, because the number of polygons is
inversely related to the frame rate.
So in conclusion, there are a few things that we found by
working on this project.
The first one being that, when you just use per-pixel error,
it is actually not at all indicative of the visual
quality of your simplified urban models.
So what that means is, you really have to go towards some
higher level knowledge, in our case, from city planning, to
help extract and retain visually salient
features in the model.
In our particular case, using urban legibility, we find that
it allows efficient and intuitive simplification of
the models.
There are some big limitations to what we're doing here.
The first one is, this rendering engine that I
implemented is horrendous.
I cut all kinds of corners.
So we're not using display lists, vertex arrays, or any of
the newer ideas in graphics card programming.
So that needs to be worked on.
The second thing is, the pre-processing step takes
absolutely forever, because all the algorithms are
basically n cubed processes.
We can think of ways to speed it up a little bit, but it's
something that we need to work on.
And something that's inherent to this algorithm is that we
end up with binary trees as our hierarchy tree.
And binary trees are just inherently deeper than
quad-trees or oct-trees.
So traversing binary trees just takes longer.
And the last thing is, we really want to improve upon
this by allowing user interactions.
And what we mean by that is, you can imagine that
districts, in a logical sense, don't always follow roads or
geometric distances.
For example, if you have a historical residential area
downtown, chances are, that's not bound by roads.
I mean, that's probably just immersed right
into everything else.
But to the local residents, that district is maybe
particularly important.
So we need to have the kind of interaction where we allow the
user to manually go in there and say,
no, this area is a little bit different.
This is a historical region, for example.
And that pretty much is it.
So I want to thank the co-authors on this.
Tom, Caroline, Zach, Bill from Charlotte, and my adviser from
Brown, Nancy Pollard, who's at Carnegie Mellon at this point.
And this project was funded by the Army MURI project.
And some other people that I work with in the architecture
department.
So this is part one.
And I'll just take some quick questions if you want.
The second half, I'll be talking a little bit about
what we want to do in the future and all.
SPEAKER 1: So for the questions, please remember that
this talk will be available on Google Video.
So reserve all the questions with confidential content
for afterward.
REMCO CHANG: Yes, sir?
AUDIENCE: So you mentioned that the per-pixel error is
not a good measure of quality.
How do you measure your quality?
How do you decide what's good quality rendering?
REMCO CHANG: Do-do-do.
You know what, we honestly have no idea.
Sorry.
The question was, if per-pixel is not a good indicator of the
visual quality of the simplification, how do we
measure our results?
And the short answer to that is, we have no idea.
I think it's interesting, similar to the paper by the
Stanford group that did the effective route maps, it's
kind of the same thing.
Because how do you justify that that kind of direction
giving is good?
Well, unless you really do a full blown user study, you
won't really know.
And I talked to some people about that particular project,
and people agree that probably 99.0% of the people in the
world would believe that that was a good sketch or drawing
of directions.
However, they mentioned some people living in the islands
of the Pacific that really believe that the earth rotates
around them.
In which case, chances are that drawing would be
absolutely useless to them.
So it is very subjective.
We thought long and hard about it.
And I think, at the end of the day, to be able to justify
this as useful, you just have to do a user study.
Yes, sir?
AUDIENCE: [UNINTELLIGIBLE] how you merged the footprints of
two neighboring buildings.
That was in the two dimensional [UNINTELLIGIBLE],
so how do you actually blend the geometry on two
buildings in 3D?
[UNINTELLIGIBLE PHRASE]
REMCO CHANG: So the question is, the cluster merging
process is all in 2D.
How does that extend to 3D geometry?
Short answer for that is, it doesn't really
extend to 3D geometry.
We basically assume that all the buildings are 2 and 1/2D.
So all you have to do is play in the 2D realm, and do
extrusion from that.
And I'll talk a little bit about future work about the 3D
geometries.
Yes, sir?
AUDIENCE: So [UNINTELLIGIBLE] working nicely, because you
have [UNINTELLIGIBLE].
How would it work with like European style cities?
REMCO CHANG: What's the difference?
AUDIENCE: Well, they don't have [UNINTELLIGIBLE].
REMCO CHANG: Yeah.
OK.
Right.
So the question is, how would this work for true
3D models, I guess?
And again, the short answer is, I don't know.
But I'll talk a little bit about what we plan on doing in
the followup.
Yes, sir?
AUDIENCE: [UNINTELLIGIBLE PHRASE].
REMCO CHANG: Sorry.
n is the number of buildings.
Actually, let me take that back.
In some cases, it could be the number of vertices.
It's a terrible algorithm.
n cubed is actually the absolute maximum, right?
So in reality, it doesn't run at n cubed, but regardless, we
never really tested the theoretical limit of where it
really is running at.
But that's something that we're definitely thinking long
and hard about.
Yes, sir?
AUDIENCE: Yeah, just to put it in perspective, how many
machine hours did it take to [UNINTELLIGIBLE] for demo?
REMCO CHANG: For this demo, it was actually pretty quick.
In this particular demo, there's about
30-some, 40,000 models.
And that was probably, I would say, half
an hour to 45 minutes.
But I will be honest with you, when we push it up to about
50,000 or 60,000, I let it run for over a week and
it never quite finished.
That n cubed really catches up with you.
OK.
I'm going to quickly go over the second part of this, where
I just want to pick your brains a little bit about what
we plan on doing in the future.
And so you guys can definitely help me out in terms of
finding a good direction for this.
Oh.
This doesn't show up very well.
This is roughly my research tree.
And it's separated into three different categories.
On the left hand side, we have mostly core graphics kinds of
problems. The middle is more on visualization of
urban forms. And the right hand side is a blend between
architecture, specifically urban morphology, and what
we've been working on so far.
So I'll start by talking about the architecture side,
urban morphology.
The idea here is that I just talked about all this stuff about
urban legibility, and how we use urban legibility for
simplification.
But the fact of the matter is, I never actually got back to
saying well, what are the legibility elements in a city?
We have no idea where the real roads are, where a park is, or
anything like that.
We just used the idea to do the simplification.
Now, it would be really cool if we can somehow go back and
find out what the elements are.
What are the important features to a city?
So this we consider as kind of a feature extraction.
And you can imagine that, if we overlay a model using very,
very strong simplification, or a lot of simplification, we
might end up with an image like this.
When we overlay that on top of a map, now we can start to see
that here's a road.
And on the GIS data side, it might say that that road is
Main Avenue, or whatever.
And this area is something or other.
Now we can start extracting, really, what are the elements
that remain after our simplification?
And we can do this-- oh, sorry, just pointing out some
of the features here--
and we can do this in an interactive manner, where we
start to do more and more simplification.
And by doing that, more and more elements
will start to appear.
And eventually, you'll get all the way from the very, very
important features, down to the very, very fine-grained
level of detail.
And what that gets us is, now we can have a ranking, or a
hierarchy, of urban legibility elements, starting with the
most important to the least important.
And this allows us to be able to quantify and identify the
urban model.
In other words, this allows us to actually have some semantic
understanding of the model that we're looking at.
And this is important, because if we can start to do this,
then we can start comparing between cities.
So let's say, currently, if you ask any architects-- are
there any architects here?--
OK.
Good.
If you ask architects, what is the difference between New
York City, Washington, DC, and Charlotte?
They would tell you, in a very abstract and a very
subjective way, that, oh, New York City is
made up of a grid.
Washington, DC is more of a ray-like structure, where all
the roads are emanating from the Congress
and the White House.
And then there's Charlotte that just has
terrible urban planning.
So the roads are just absolutely everywhere.
But these are things that people can tell you in a very
subjective way, right?
But this is not quantifiable, this is not repeatable.
And our goal is that, if we can somehow start to quantify
a city, then we can start really comparing between these
different cities in a more objective manner.
And along the same line, we can start to track how a city
changes over time.
Again, in a quantifiable manner.
Here I just show two images.
Again, this is Charlotte in 1919.
And then this is Charlotte today.
And this is roughly the same region.
But you can see that a lot of expansion has occurred.
And nobody can really tell you, quantifiably
speaking, what are the things that occurred over this
hundred years.
But what people can tell you is that, in 1919, Charlotte was
a pretty perfect grid.
Oh, didn't show up very well.
But in 1919, Charlotte was a very good grid.
Our downtown area is, you know, First Street, Second
Street, Third Street, and so on.
But once you start expanding, then the roads just go
all over the place.
Now, we presented this idea at a workshop.
And we found out that this whole idea of being able to
track the city changing over time is still a fundamental
challenge in the GIS community.
People have not had a good grasp on how to be able to do
this, in terms of a city changing over time.
Along the same line, if we can start to quantify a city, the
other thing that we'll be able to start
doing is smart labeling.
Or in other words,
position-based intelligent labeling.
So for example, if I give you a model,
again, this is of China.
Somebody says, I'm here.
And then asks, well, what are the important
things around me?
Now we can start to say, well, here's a canal in front of
you, here's E Street, this is Main Ave, and so on.
An interesting thing I want to point out is, now we can start
to show the important features, based on their
relationship to where you are.
So E Street, for example, might not be very important in
the global sense, but locally, to you, it
might be very important.
So we'll be able to highlight that because it's close to
where you are.
Similarly, Main Avenue, even though it's not related to
you, because it's important, we'll show it.
And then, you can start to show greater
regions, such as downtown.
And just group all the elements
within it into a district.
The second half of this is more of an academic question.
Now, if we can actually do the thing that we just said, can
we reverse this process back to its beginning?
In other words, can we start to sketch out a mental map
similar to the ones that were drawn in Kevin
Lynch's 1960 book?
So we believe that there is a direct correlation between
being able to do this point-based, intelligent
labeling, and being able to retrieve a
sketch of a mental map.
So this will be a similar idea to what the Stanford group was
doing with intelligent route maps.
But we'll be doing mental maps instead.
The second half of this research direction is on
visualizing urban forms. And this is a very interesting
problem for the architects, because they mentioned that
the study of urban form really hasn't changed at all since
the early 19th century.
People today are still looking at urban forms that are either
2D or 3D maps.
So for example, this is something taken from ArcGIS,
which is a product by ESRI.
And this is what people normally look at when the
urban planner is looking at an urban model.
Now, if you look at this, and even though you overlay
additional layers of GIS information--
in this case, you have some street information, you have
some electricity information--
the bottom line is, these maps do not actively help the
viewer understand changes or trends occurring in the city.
In other words, it really doesn't help an urban planner
to be able to do his job any better.
So what we'd like to do is to start to apply some things
that we do know as visualization people.
We want to apply some of these visual analytics techniques to
urban form.
So let's say that we start to integrate a lot of these more
abstract visualization elements, alongside
this urban form idea.
Now we can start to really see how a city changes, just in a
more data kind of perspective.
And by the way, this is not actually built.
The background image, this whole system, was actually
built for Bank of America on international wire transfer
fraud detection.
But it looks good together.
And we showed this whole idea to the architects, and they
immediately found things that they can start using.
For example, the bottom image or the bottom window shows
changes over time.
And they can start to see how they can put the things that
are changing in the urban model into such a format.
So we believe that this is very promising.
And the last part is more of a core
graphics kind of question.
And here I'm definitely doing a lot of hand waving.
There are a lot of people here who know a lot more
about this than I do.
But the first thing that we start realizing is, using our
method, we can really do some model compression.
And on top of that, we can progressively stream the
changes from coarse to fine.
This is really well built into the system that we have, so
this should be a very simple step.
But I think it's an important step, where not all the data
has to be transferred all the time.
You can just transfer incremental changes.
And the second thing is, we want to be able to extend this
to 3D models.
Now this is a much bigger problem, because this includes
true 3D models of buildings.
And this also includes what we consider as trees and forests.
Trees and forests are a very interesting topic in terms of
GIS modeling, as such.
Because people really haven't had a good grasp of this.
Most people still do trees as billboards, or
a combination thereof.
And realistically speaking, people haven't really done a
great job in terms of seeing a whole lot of trees.
So to us, the problem is very similar.
On one side, we showed that we can do simplification of lots
of very simple 3D, 2 and 1/2D urban models.
And trees and forests are the same way.
People have been able to do L-system trees and whatnot,
where you can procedurally generate trees.
But viewing trees in a large quantity is still a very
difficult problem.
So we think that we might be able to use some of the
ideas that we gained from doing 3D or 2 and 1/2D
building models, and apply them to trees, and be able to take
a stab at visualizing large quantities of
trees, if not forests.
So we took a quick stab at this.
And what you're seeing here is basically a 3D model
simplified using our method, eventually getting back to the 3D
convex hull.
It's very preliminary.
And the truth is, going from 2D to 3D, there are a lot of
technical difficulties that we haven't
completely figured out yet.
But we're taking some stabs at it to see where it goes.
And that pretty much is the talk.
Are there any questions?
Comments?
Yes, sir?
AUDIENCE: When you said that the landmark detection
[UNINTELLIGIBLE PHRASE]
you tried that with actual data sets
[UNINTELLIGIBLE PHRASE]
where most of the buildings are skyscrapers, or all
different kinds?
REMCO CHANG: Yeah.
So the question is, in terms of landmark detection, have we
done this in a scenario where all the buildings are
skyscrapers?
The answer is yes.
And the idea there is, because the system doesn't really care
whether they're skyscrapers or not, you
basically have a base cluster mesh.
Anything taller than that, based on your current view
position, would be shown on top of it, or overlaying it.
It doesn't really matter if they're all
skyscrapers or not.
They will still be rendered correctly.
AUDIENCE: [UNINTELLIGIBLE PHRASE].
REMCO CHANG: That's basically based on where
your eye point is.
So that's a very good question, though.
I can talk to you a little bit about this afterwards.
But the idea of projecting your negative space, and the
reason why the negative space is depicted as a 3D box, is
because you do want to know about the height.
So when you're looking at a tall cluster mesh from the
side, the projection of the negative space will still
project as something really large.
In which case, it will be broken down into smaller
sub-pieces.
Anything else?
Actually, just to follow up a little bit on that, there's
an interesting question about what is
considered a landmark.
And that's a very open question.
If you ask a person who lives in the city, and you say,
well, what is a landmark around where you live, they
might pick out the local grocery store.
They might pick out the watering hole, the pub that
you usually go to.
Maybe because it has probably been there for 40 years,
and everybody knows about it.
But from a purely graphics and visualization perspective,
that information is not really as relevant, because they're
not visually important.
But if we're trying to do this system, for example, for the
purpose of a mental map, then what constitutes a landmark
is not just purely geometry.
Then you really have to start incorporating other GIS data
sets to be able to figure some of those things out.
Any questions?
Any other questions?
Well, thank you very much.
I really appreciate it.
[APPLAUSE]