Big Data and Its Potentials

Data exists everywhere nowadays. It flows into every area of the economy and plays an important role in decision-making. Indeed, "businesses, industries, governments, universities, scientists, consumers, and nonprofits are generating data at unprecedented levels and at an incredible pace" to ensure the accuracy and reliability of their data-driven decisions (Gordon-Murnane 30). With technology and the economy growing at remarkable speed, data volume is increasing faster than anyone would have expected. In particular, the widespread "accessibility, affordability, and availability of new digital devices that make access to the internet easy" produce a volume of data that traditional databases are not capable of storing (Gordon-Murnane 30). This has prompted the concept of "big data," which refers to "datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze" (Bughin 11). Recognizing the importance of big data in the current global economy, this paper introduces the characteristics, opportunities, and challenges of big data, its business models, and its applications in different fields of the economy.

By definition, big data is data that "exceeds the processing capacity of conventional database systems" (Slocum Ch.2, location 31). Businesses turn raw data into useful, informative data that provides insights and supports business strategy. When data is too big to fit the structures of a company's database architecture, the company must find other ways to process it and extract value from it. That is why the public and private sectors are now taking advantage of big data analytics. Big data is commonly characterized by three V's: volume, velocity, and variety (Slocum Ch.2, location 46).

As mentioned earlier, vast amounts of data flow from consumers every day through "email, searching, browsing, blogging, tweeting, buying, sharing, and texting" (Gordon-Murnane 30). Even though the exponential growth of data volume is an important trait of big data, there is no exact minimum number of terabytes above which data counts as big data, because that threshold shifts with the advance of technology.

Importantly, it is misleading to understand big data solely in terms of size. Big data is not simply bigger in volume; it is more complex than that. The rate at which data flows into an organization also plays an important role in defining big data. For instance, online retailers "are able to compile large histories of customers' every click and interaction: not just the final sales" (Slocum Ch.2, location 86). They use the data flowing back from consumers to understand what their customers want and to make decisions about different parts of the business, such as new products and new marketing strategies (Gordon-Murnane 31). The velocity of incoming data matters to organizations because the faster data flows in, the faster they can make decisions and gain a competitive advantage over their rivals. For the same reason, the speed at which input is transformed into a decision is crucial, as the sketch below illustrates.
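
To make the velocity point concrete, the following is a minimal Java sketch of the kind of per-customer clickstream aggregation described above. The ClickEvent type and its field names are illustrative assumptions, not details from the cited sources; a real retailer would use a streaming platform rather than an in-memory map, but the principle is the same.

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative sketch: fold each click into a running per-customer count
    // the moment it arrives, so a decision can follow the data immediately.
    public class ClickstreamAggregator {
        // Hypothetical event type; the fields are assumptions for illustration.
        record ClickEvent(String customerId, String page, long timestampMillis) {}

        private final Map<String, Integer> clicksPerCustomer = new HashMap<>();

        public void ingest(ClickEvent event) {
            clicksPerCustomer.merge(event.customerId(), 1, Integer::sum);
        }

        public int clicksFor(String customerId) {
            return clicksPerCustomer.getOrDefault(customerId, 0);
        }

        public static void main(String[] args) {
            ClickstreamAggregator agg = new ClickstreamAggregator();
            agg.ingest(new ClickEvent("c42", "/product/123", System.currentTimeMillis()));
            agg.ingest(new ClickEvent("c42", "/cart", System.currentTimeMillis()));
            System.out.println(agg.clicksFor("c42")); // prints 2
        }
    }

The only point of the sketch is that aggregation happens on arrival rather than in a later batch, which is what turns fast-moving input into fast decisions.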

It is not only the increased amount of data that matters; the variety of data types is also significant. Data can come from many diverse sources. With the development of the digital world, including social networks, email, and smartphones, data does not arrive in a particular order or structure. Instead, it can flow into an organization as a line of text, an email, an image, or even a status update or a comment on a social networking site. "Different browsers send different data," and different users might be using different software and tools to communicate (Slocum Ch.2, location 114), so conventional storage built around a single, simple data type cannot cope with this complexity, making it difficult to collect informative data. Big data processing is therefore used to take varied, unstructured data and turn it into meaningful information. Big data analytics also lets users keep all of the information rather than cutting any part of it, because there may be useful details in the pieces that would otherwise be thrown away. This reflects an important principle of big data: "when you can, keep everything" (Slocum Ch.2, location 120).
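
As a rough illustration of the "keep everything" principle, the sketch below stores every incoming record verbatim and attaches only a best-effort type label. The Item class and the classification rules are hypothetical and deliberately crude; the point is that the raw input is never discarded, whatever its structure.

    import java.util.ArrayList;
    import java.util.List;

    // Illustrative sketch: never discard input. The raw text is always kept,
    // so details that look useless today can be re-parsed later.
    public class VarietyStore {
        // Hypothetical container pairing untouched raw data with a label.
        record Item(String raw, String kind) {}

        private final List<Item> items = new ArrayList<>();

        public void ingest(String raw) {
            String kind;
            if (raw.trim().startsWith("{")) {
                kind = "json";   // e.g. a status update from a social site
            } else if (raw.contains("Subject:")) {
                kind = "email";
            } else {
                kind = "text";   // free text: a search query, a comment, ...
            }
            items.add(new Item(raw, kind));  // raw input is retained either way
        }

        public List<Item> all() { return items; }

        public static void main(String[] args) {
            VarietyStore store = new VarietyStore();
            store.ingest("{\"user\":\"ann\",\"status\":\"got a new phone\"}");
            store.ingest("From: bob@example.com Subject: order inquiry");
            store.ingest("winter boots size 9");
            store.all().forEach(i -> System.out.println(i.kind() + " <- " + i.raw()));
        }
    }
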

This explosion of data is relatively new. Around the year 2000, "only one-quarter of all the world's stored information was digital," and the rest was still on "paper, film and other analog media" (Cukier). Today, "less than two percent of all stored information is nondigital" (Cukier). The rise of big data was triggered by large companies such as Google, Yahoo, Amazon, Facebook, and Twitter, which produce tremendous amounts of "clickstream data that is only valuable if it is collected and analyzed" (Lamont). This made traditional Web analytics tools obsolete and insufficient, driving the expansion of big data analytics to handle complex, massive data processing. These companies are notable examples of big data used "as an enabler of new products and services." For instance, Facebook has been able to create a personalized user interface and tailored advertising by combining a huge number of signals from a user's activities as well as those of their friends (Slocum Ch.2, location 46).

With the explosion of big data, Apache Hadoop was introduced as a tool to handle workloads that conventional relational databases could not cope with, because ultimately organizations do not want to throw away data just because it does not fit; they cannot know for sure whether they will need those pieces of data in the future to support their decision-making process. Hadoop is an open-source, "Java-based programming framework" that supports the distributed processing of large datasets across clusters of servers. Hadoop thus offers a cheap, new way to store and process huge amounts of data, regardless of how big, complex, and unstructured the data is, or how many types of data there are. Unlike conventional distributed databases, which require schemas and can only process structured data, Hadoop can deal with data of any format and any size, which makes it a driving force behind the rapid growth of big data today (Slocum Ch.2, location 180).
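
The canonical first Hadoop program is WordCount, which counts how often each word appears across arbitrarily many input files. The sketch below closely follows the WordCount example from the Apache Hadoop MapReduce tutorial and is shown here only to make the framework's division of labor concrete; the input and output paths are supplied as command-line arguments.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Map step: each mapper sees one slice of the input and emits a
        // (word, 1) pair for every word it finds in its slice.
        public static class TokenizerMapper
                extends Mapper<Object, Text, Text, IntWritable> {
            private final static IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, ONE);
                }
            }
        }

        // Reduce step: the framework groups all pairs that share a word and
        // hands them to one reducer, which sums the counts.
        public static class IntSumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            private final IntWritable result = new IntWritable();

            @Override
            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class); // pre-sum on each node
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // input dir
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // output dir
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Because the splitting of input, the shuffling of intermediate pairs, and the retrying of failed tasks are all handled by the framework, the same small program scales from a single file to terabytes spread over a cluster.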

The development of Apache Hadoop was originally derived from Google's MapReduce and Google File System. MapReduce is Google's patented programming framework for processing vast amounts of data reliably, easily, and inexpensively. It is able to take a query "over a dataset, divide

...
