Name: Reynold Xin | Spark Summit 2017
Uploaded: 2021-01-14T08:06:17.000Z
Duration: 15 min 3 s

>> Welcome back, we're here at
theCUBE at Spark Summit 2017.

I'm David Goad here with
George Gilbert, George.

Well here's the other
man of the hour here.

We just talked with Ali,
the CEO at Databricks

and co-founder at Databricks, Reynold Xin.

a lot of interesting
people with who I meet.

>> Well I know you're a really humble guy

but I had to ask Ali
what should I ask Reynold

Reynold is one of the biggest
contributors to Spark.

And you've been with us
for a long time right?

and lately more I'm
working with other people

>> Well let's get started
talking about some

maybe our audience at theCUBE hasn't heard

What are some of the most
exciting new developments?

>> So, I think in general
if we look at Spark,

there are three directions I
would say we're doubling down.

One, the first direction
is deep learning.

Deep learning is extremely
hot and it's very capable

but as we alluded to
earlier in a blog post,

deep learning has reached
sort of a mass produced point

in which it shows tremendous
potential but the tools

And we are hoping to
democratize deep learning

and do what Spark did to
big data, to deep learning

with this new library called
deep learning pipelines.

deep learning libraries directly in Spark

and can actually expose models in SQL.

Streaming, again, I think
that a lot of customers

have aspirations to
actually shorten the latency

and increase the throughput in streaming.

So, the structured streaming
effort is going to be

I think our customers processed
three trillion records,

last month alone using
structured streaming.

And we also have a new
effort to actually push down

the latency all the way
to some millisecond range.

So, you can really do blazingly
fast streaming analytics.

And last but not least is the
SQL data warehousing area,

Data warehousing I think
that it's a very mature area

from the outside of big data point of view,

but from a big data one
it's still pretty new

and there are a lot of use
cases that are popping up there.

And Spark with approaches like
the CBO and also impact here

we're actually substantially
improving the performance

and the capabilities of
data warehousing features.

>> We're going to dig in to
some of those technologies here

But have you heard anything
here so far from anyone

that's changed your mind maybe
about what to focus on next?

So, one thing I've heard
from a few customers

So many of them are
fairly technical engineers

and some of them are less
sophisticated engineers

and they have written jobs and
sometimes the job runs slow.

And so the performance
engineer in me would think

The different way to
actually solve that problem

is how can we expose the right information

This is why my job is slow
and this is how I can tweak it

you actually give them the tools to fish.
