Architecture Modernization with Cloudera
Essay by frank.vullers • April 19, 2017 • Course Note • 1,468 Words (6 Pages) • 962 Views
Architecture modernization
Dear Attendees
Who doesn’t want to work with the latest architecture?
But who knows what that actually is?
After our presentation you will better understand why you should modernize your architecture, you will be more aware of the different tools available, and you will have better insight into the things to do.
http://jalopnik.com/the-tesla-model-s-raises-the-suspension-based-on-locati-1636877061
Bob, Jim here. We got a lot of complaints about damaged battery covers and scratches on the bottom of the car.
Could your team have a look at it? I don’t want to do a recall.
Next day
Jim, we found some strange stuff. Every car that has a damaged cover shows unusual sensor activity on certain days and at certain locations. The air suspension and shock absorbers in particular reacted heavily.
We looked deeper into the location data and found that a lot of the time it happened on off-road tracks.
The idea of my team is to lift the car automatically 1. 2 inch when the sensors react heavily.
We can build the new air suspension settings into the next software release.
My name is Frank Vullers. I am a Business Value Consultant for Cloudera. I help customers find use cases (discovery workshops) and calculate the benefits (value assessments).
Our relationship with data is changing. In the old days, data was the result of an action. Now data itself can be an action.
TRANSITION: Let’s have a look at how data is used
Evolution in use of data
We use a lot of data in traditional Business Intelligence: daily reports and so on.
In today’s world more data, and more kinds of data, are available. Big Data Analytics gives us new possibilities.
The world moves faster and faster, and BI needs to react as well. Fast Data Analytics is the answer to this.
In the next slides I would like to go deeper into these 3 areas.
Traditional BI
In the classic BI world, the business starts by asking questions, and IT structures the data to answer those questions.
Only the data that is needed is captured.
Traditional analytics, such as statistical analysis or segmentation, is used to come up with answers.
Who are our customers? What do my customer segments spend? And so on.
Big Data Analytics
Big data analytics works in a different way. All data is captured in case it’s needed, and it is multi-structured. The business explores the data to find questions worth answering.
Who will be our customers in 6 months and where will they come from?
In this world new kinds of analytics are used, such as:
- Path analysis, e.g. to see how customers move from a gold to a silver to a bronze customer card over time
- Text analytics, e.g. finding out how people react to your commercials by reading tweets
- Graph analytics, e.g. finding which customer is most influential and should be rewarded
- MapReduce
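As an illustration of the path analysis idea above, here is a minimal Python sketch that counts how often customers move from one card tier to another. The customer IDs, the tier histories, and the function name count_transitions are made up for this example; real data would come from the customer systems.

```python
from collections import Counter

# Hypothetical tier history per customer, oldest first (illustrative data only).
tier_history = {
    "c1": ["gold", "silver", "bronze"],
    "c2": ["gold", "silver"],
    "c3": ["silver", "bronze"],
    "c4": ["gold", "gold", "silver"],
}


def count_transitions(histories):
    """Count how often customers move from one tier to a different tier."""
    transitions = Counter()
    for path in histories.values():
        # Pair each tier with the tier that follows it in time.
        for current, following in zip(path, path[1:]):
            if current != following:  # skip periods without a change
                transitions[(current, following)] += 1
    return transitions


transitions = count_transitions(tier_history)
for (src, dst), n in transitions.most_common():
    print(f"{src} -> {dst}: {n}")
```

The same counting approach scales conceptually to longer paths (for example gold, silver, bronze as one path) by counting tuples of three steps instead of pairs.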
Fast Data Analytics
More and more we want to act on real-time or near-real-time events. First we need to do the analysis within the given time frame; next we have to react.
Examples of this are:
Event Detection
- Fraud/Risk Detection
- Spam Filter
- Marketing Alerts
Recommendation Engine
- Next Best Offer
- Content and/or Services Recommendation
Model scoring
- Embedded Analytics
- Analytic Aggregates
- Reports
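To make the event detection example concrete, here is a small Python sketch of one common pattern: flag a card when it produces too many transactions within a short time window. The class name VelocityDetector, the 60-second window, and the limit of 3 are assumptions chosen for illustration, and the sketch assumes events arrive in time order. A production setup would typically run this kind of logic inside a streaming engine.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60    # assumed window size, for illustration only
MAX_TX_PER_WINDOW = 3  # assumed threshold, for illustration only


class VelocityDetector:
    """Flags a card that exceeds a transaction count within a sliding window."""

    def __init__(self, window=WINDOW_SECONDS, limit=MAX_TX_PER_WINDOW):
        self.window = window
        self.limit = limit
        self.recent = defaultdict(deque)  # card id -> recent timestamps

    def observe(self, card_id, timestamp):
        """Record one event; return True if the card is now over the limit."""
        events = self.recent[card_id]
        events.append(timestamp)
        # Drop events that have fallen out of the time window.
        while events and events[0] <= timestamp - self.window:
            events.popleft()
        return len(events) > self.limit


detector = VelocityDetector()
# Hypothetical stream of (card id, timestamp in seconds), in time order.
stream = [("A", 0), ("A", 10), ("B", 15), ("A", 20), ("A", 30), ("A", 200)]
alerts = [(card, ts) for card, ts in stream if detector.observe(card, ts)]
print(alerts)
```

The analysis (counting within the window) and the reaction (raising the alert) both happen as each event arrives, which is the point of fast data analytics.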
TRANSITION: Now that we have seen the changes in data usage, it is time to switch to the architectural view
Architecture View
Let’s go to the architecture view
Schema on read is the change agent
An important aspect of the enterprise data hub is support for both 'schema on write' and 'schema on read' in order to handle routine and exploratory workloads.
- Schema on write (as with traditional databases) provides good performance as it is possible to lay out the data efficiently, as well as good governance.
- Schema on read allows users to store any data as the system looks more like a file system than a database. It effectively performs ETL (extract, transform, load) on the fly at read time, generating the appropriate schema as part of the process. This means an additional column of data can be provided for analysis very quickly.
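The schema on read idea can be sketched in a few lines of Python. The raw records are stored untouched, and a schema is applied only at the moment the data is read. The sample records, the field names, and the helper read_with_schema are hypothetical and only serve to show the principle; they are not a Cloudera API.

```python
import json

# Raw records stored as-is, one JSON document per line (hypothetical data).
RAW_LINES = [
    '{"customer": "c1", "amount": "19.90", "channel": "web"}',
    '{"customer": "c2", "amount": "5.00"}',
    '{"customer": "c3", "amount": "42.10", "channel": "store", "coupon": "X1"}',
]


def read_with_schema(lines, schema):
    """Apply a schema (field name -> converter) while reading.

    Fields missing from a record become None.
    Fields that are not in the schema are ignored.
    """
    for line in lines:
        record = json.loads(line)
        yield {
            name: (convert(record[name]) if name in record else None)
            for name, convert in schema.items()
        }


# First version of the schema: only two columns are exposed.
schema_v1 = {"customer": str, "amount": float}
# Adding a column later needs no reload of the stored data.
schema_v2 = {**schema_v1, "channel": str}

rows_v1 = list(read_with_schema(RAW_LINES, schema_v1))
rows_v2 = list(read_with_schema(RAW_LINES, schema_v2))
print(rows_v2)
```

Note that going from schema_v1 to schema_v2 only changes the reading side. With schema on write, the same change would mean altering the table and reloading or backfilling the data.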
The logical architecture hasn’t changed
We see here the logical architecture of data warehousing. Ralph Kimball talked about it in the seminar "The Future of Data Warehousing: ETL Will Never Be the Same".
I don’t want to go into the details of this architecture, but I do want to highlight the fundamental differences.
Ralph talks about the EDW back room and the EDW front room.
We see data coming in from the original source systems, being processed in the ETL step, and then exposed in the presentation layer and the BI applications.
BUT the physical architecture of the back room now looks very different.
When we bring our Enterprise Data Hub into the game, we bring in HDFS files and, most important, schema on read.
Old Backroom
...
...