Big data is currently a buzzword circulating through the industry, with many organizations pondering its value, its virtues and the implementation strategies needed to harness its benefits. Many organizations understand that this data, when tapped properly, can offer a host of benefits, including market intelligence and competitive advantage. However, one challenge many organizations still face is grasping what big data really is and where to start in order to take advantage of it.
There are still a number of myths surrounding big data, and distinguishing myth from reality is the first step. Some of these myths include:
– Big data is about external data: Experts estimate that most organizations analyze only 12 percent of their internal data. Big data is about making the data you already have work for you in extraordinary ways. External data sources may be included to bring more insight, but this is not the main focal point for many organizations.
– Big data is about size: On average, an Enterprise Data Warehouse (EDW) houses about 15 terabytes (TB) of data, while an average Hadoop installation holds 150 TB to 200 TB. Big data is therefore bigger than your existing data warehouse, but only by about an order of magnitude. Some entities, such as governments, store far more, on the order of petabytes or even exabytes of information.
– Big data means in-memory (in-memory gives users immediate access to the right information): Shifting your existing Structured Query Language (SQL) processing to in-memory databases will reduce processing times, but it does not by itself yield new insights. It helps you reach the right information faster; what happens next, and how that information is used, depends on the organization. Big data allows us to answer new questions without depending on structured schemas, whether in memory or not. Although in-memory is great for high-speed decision-making on data sets, big data analytics are often batch in nature.
– We can use our existing technology: When relational databases entered the mainstream about two decades ago, developers had to learn new skills, such as SQL. The big data revolution will likewise require people to acquire new skills if we are to take full advantage of the benefits of unstructured data sets. SQL is not a long-term solution for exploiting big data.
– Big data is about Hadoop: Although a range of technologies and platforms can be used to support and implement big data analytics, Hadoop is the preferred platform for over 80% of big data implementations. Unlike the others, this myth is largely true – any serious big data analytics solution is likely to be based on Hadoop. Hadoop addresses key big data challenges, allowing you to consolidate both structured and unstructured data quickly, and provides cheap, scalable storage and analytics power. (Hadoop is an open-source software framework for storing and processing big data in a distributed fashion on large clusters of commodity hardware. Essentially, it accomplishes two tasks: massive data storage and faster processing. Source: SAS.)
– Big data is free: Although infrastructure costs are relatively low, labor costs are relatively high. Companies that reduce their dependency on big development teams – by exploiting easier-to-use platforms and supporting self-service – will get insights quicker, stay ahead of their competitors, and have a lower total cost than those that spend years reinventing code themselves.
– Big data is an IT problem: Big data must have clear business goals to ensure intelligence is harnessed – this is not an IT problem but rather a business issue. Big data is about finding the answers to critical questions about your customers, your channels, your markets and your products that can differentiate you from your competition, improve your customers' experience with you, and make you money.
– Big data is too complicated: Big data does require new skills and approaches. Yet modern self-service data discovery platforms, such as Datameer, exist to shield you from the technical complexity and allow you to deliver quickly and easily.
– We need a data scientist: Most companies will be able to deliver big data analytics using a team of existing staff – business analysts, business managers and statisticians who collaborate, using a shared platform, to design and deploy appropriate analytics. The key here is to focus on the business problem, not on complex technology.
– No more data quality problems: More data will typically mean more data quality problems. Big data must be managed and assessed for data quality just as any other data is – if accurate and trusted insight is to be achieved.
– Big data is just hype that will pass: Although big data is in its infancy in East Africa, colleagues and competitors in Europe and the USA are already delivering real value in areas such as fault management, customer experience management, market-driven pricing, compliance projects, fraud management and much more. Big data is not just a fad; it is a trend that is here to stay. Companies that act first could build an unassailable advantage over their slower-moving competitors.
Clearly defining and understanding the common big data myths will allow an organization to fully comprehend where to begin.
One challenge that an organization will face, especially with regard to budgets, is whether to buy or build a solution. Building big data analytics using open source code could take a year or more, require scarce (and expensive) skills, and may allow more agile rivals a critical head start. Packaged platforms can allow you to deliver insight in a matter of weeks, with minimal training and support for your existing people. Of course, an organization is not limited to one option and may combine both approaches.
Big data is not as daunting as most assume, yet many organizations 'drag their feet' because of the misconceptions and misunderstandings surrounding it. It can offer a wealth of information and value to an organization, giving it a competitive advantage and improving customer relationships. It is recommended that you opt for a solution that is already available, such as one built on Hadoop, that offers quick implementation and real value in a short period of time.