Abstract
1 Introduction

This paper explains one of the concepts and technologies behind Big Data, MapReduce, and the ways Hadoop data can be queried and then used. Business Intelligence has long-standing issues with data computation, data transformation, and analysis speed. Once the explosion in the amount of data occurred, prediction and data mining ceased to be separate disciplines. Therefore, customers need to be able to go beyond simple reports and find new ways to understand the data and detect trends and opportunities.

Big data brings many advantages, but two of them are the most important. The first concerns the financial benefits and outcome, and the second concerns the entire process flow, from the organization of the data to its delivery.

Some organizations researching big data report that terabyte-scale storage of structured data is currently provided most cheaply by big data technologies such as Hadoop clusters. For one company, the cost of storing one terabyte for a year was $37,000 for a traditional relational database, $5,000 for a database appliance, and $2,000 for a Hadoop cluster. Of course, these figures are not directly comparable, because the more traditional technologies may be safer and more easily administered [1].

Big Data is a concept that promises to help in all these areas, starting from the three V's: volume, velocity, and variety.

> Volume: big data is the "ocean of data" mentioned above. It is represented by information that can come from every possible sensor, and some even say that we, the people, are also sensors and data gatherers for big data [9]. The challenge of having such a large quantity of data is that it is very hard to sustain, store, analyze, and ultimately use.

> Velocity: the speed at which data travels from one point to another and the speed of processing it.
Sometimes it is crucial for a manager to be able to decide in very little time on a variety of issues [2]. The most important issue is that the resources that analyze data are limited compared to the volume of data, while the requests for information are unlimited, so information usually passes through at least one bottleneck.

> Variety: the third characteristic is represented by the types of data that are stored. Because there are many types of sensors and sources, the data coming from them varies greatly in size and type. It is very complicated to analyze text, images, and sounds in the same context and obtain a result that can be relied on. There is also the issue of dark data: data that sits in the organization unused, yet is not free.

One new dimension has been added to the existing ones: Veracity.

> Veracity: this is the hardest attribute to achieve with big data, because due to the volume of information and the variety of its types it is hard to separate useful and accurate data from the "dirty data". The biggest problem is that "dirty data" can very easily lead to an avalanche of errors and incorrect results, and can degrade the Velocity attribute of Big Data. The main purpose of Big Data can be undermined, and all the information can lead to a useless and very expensive Big Data environment if there is not a good data cleaning team. The Veracity attribute is in itself also an objective for Big Data developers. If the data is inaccurate, redundant, or unreliable, the whole company can have a big problem, especially companies that sell information, such as marketing firms or those that conduct market studies. Many social media responses to campaigns could be coming from a small number of disgruntled past employees or persons employed by the competition to post negative comments.

2 MapReduce Model

MapReduce is a programming model used for processing and generating large data sets.
The computation takes a set of input key/value pairs, produces a set of output key/value pairs, and has two levels of processing. …
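To make the two levels of processing concrete, the following is a minimal word-count sketch in plain Python. The function names (`map_fn`, `reduce_fn`, `shuffle`) and the in-memory grouping step are illustrative assumptions for exposition, not Hadoop's actual API; in a real cluster the framework performs the shuffle across machines.

```python
from collections import defaultdict

def map_fn(doc_id, text):
    # Map level: emit an intermediate (word, 1) pair for each word.
    for word in text.lower().split():
        yield word, 1

def shuffle(pairs):
    # Group intermediate values by key (done by the framework in Hadoop).
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups.items()

def reduce_fn(word, counts):
    # Reduce level: merge all values for one key into a final pair.
    return word, sum(counts)

documents = {"d1": "big data big insight", "d2": "big cluster"}
intermediate = [p for did, text in documents.items() for p in map_fn(did, text)]
result = dict(reduce_fn(k, v) for k, v in shuffle(intermediate))
print(result)  # {'big': 3, 'data': 1, 'insight': 1, 'cluster': 1}
```

The key property the model relies on is that each reduce call sees every intermediate value for its key, so the map and reduce steps can run in parallel across many machines.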
Trifu, M. R., & Ivan, M.-L. (2016). Big Data Components for Business Process Optimization. Informatica Economica, 20(1/2016), 72–78. https://doi.org/10.12948/issn14531305/20.1.2016.07